ABSTRACT

We present a new cluster detection algorithm designed for the Panoramic Survey Telescope 
and Rapid Response System (Pan-STARRS) survey but with generic application to any multi- 
band data. The method makes no prior assumptions about the properties of clusters other than 
(a) the similarity in colour of cluster galaxies (the "red sequence") and (b) an enhanced pro- 
jected surface density. The detector has three main steps: (i) it identifies cluster members by 
photometrically filtering the input catalogue to isolate galaxies in colour-magnitude space, 
(ii) a Voronoi diagram identifies regions of high surface density, (iii) galaxies are grouped 
into clusters with a Friends-of-Friends technique. Where multiple colours are available, we 
require systems to exhibit sequences in two colours. In this paper we present the algorithm 
and demonstrate it on two datasets. The first is a 7 square degree sample of the deep Sloan
Digital Sky Survey equatorial stripe (Stripe 82), from which we detect 97 clusters with
$z \leq 0.6$. Benefiting from deeper data, we are 100% complete in the maxBCG optically-selected
cluster catalogue (based on shallower single-epoch SDSS data) and find an additional 78
previously unidentified clusters. The second dataset is a mock Medium Deep Survey (MDS)
Pan-STARRS catalogue, based on the ΛCDM model and a semi-analytic galaxy formation
recipe. Knowledge of galaxy-halo memberships in the mock allows a quantification of
algorithm performance. We detect 305 mock clusters in haloes with mass
$> 10^{13}\,h^{-1}\,M_\odot$ at $z \lesssim 0.6$ and determine a spurious detection rate of
< 1%, consistent with tests on the Stripe 82 catalogue. The detector performs well in the
recovery of model ΛCDM clusters. At the median redshift of the catalogue, the algorithm
achieves > 75% completeness down to halo masses of $10^{13.4}\,h^{-1}\,M_\odot$ and recovers
> 75% of the total stellar mass of clusters in haloes down to $10^{13.8}\,h^{-1}\,M_\odot$.
A companion paper (Geach, Murphy & Bower 2011, hereafter GMB11) presents the complete
cluster catalogue over the full 270 deg$^2$ Stripe 82 catalogue.

Key words: catalogues - galaxies: clusters: general - cosmology: observations - cosmology: 
large-scale structure of universe 



1 INTRODUCTION 

Galaxy clusters are integral tools in our drive to test the ΛCDM cosmological model and our
understanding of galaxy formation. The evolution of the cluster population with redshift, for
example, can impose important constraints on the matter density of the universe
(Carlberg et al. 1996; Evrard 1997; Schuecker et al. 2003) and the growth of primordial
density fluctuations (Frenk et al. 1990; White, Efstathiou & Frenk 1993; Fedeli, Moscardini &
Matarrese 2009). The deep potential wells of clusters offer a suite of laboratories within
which detailed studies of gas-galaxy interactions are possible. There is evidence that
clusters have been in place for a significant fraction of the star-forming history of the
universe, meaning they can provide a unique insight into how environmental effects shape
the evolutionary path of galaxies.

The cluster mass budget is dominated by the presence of dark matter (~85%; for a
comprehensive review see Voit 2005), making clusters ideal sites for identifying strongly
lensed background galaxies (Smail et al. 2007) and thus providing glimpses of the early
star-forming universe (Swinbank et al. 2010). Weak lensing studies can determine the projected
mass distribution of clusters (e.g. Sheldon et al. 2004) and in some cases that of the dark
matter itself (Clowe et al. 2006). Hot intracluster gas also leaves an imprint on the cosmic
microwave background (CMB) by way of the Sunyaev-Zel'dovich effect (SZ; Sunyaev &
Zeldovich 1980; Carlstrom, Holder & Reese 2002), via the inverse Compton scattering of CMB
photons. At the megaparsec scale, clusters act as high-mass lamp-posts between the
filamentary connected structure tracing out the cosmic web (Pimbblet & Drinkwater 2004;
Colberg, Krughoff & Connolly 2005).

E-mail: david.murphy@durham.ac.uk

There is therefore great merit in producing a homogeneous cluster census of the Universe,
and much effort has gone into producing comprehensive cluster surveys. Efforts to this end
are broadly separated into two wavelength domains: the optical-near-IR and X-ray. We note in
passing that cluster detection by SZ-decrement in the microwave is an emerging cluster survey
technique that holds promise at high redshift (McInnes et al. 2009; Brodwin et al. 2010;
Hincks et al. 2010; Vanderlinde et al. 2010).

2 Murphy et al.

X-ray detections exploit the hot intracluster gas accounting for the bulk of the cluster
baryonic mass component (Cavaliere & Fusco-Femiano 1976; Allen, Schmidt & Fabian 2002).
X-ray selected cluster catalogues tend to be robust to projection effects, probe large
volumes and produce a cluster sample with well characterised masses. Cluster catalogues from
large area X-ray surveys (e.g. Ebeling et al. 1998) identify bright, massive clusters, with
their deep potential wells establishing the high electron densities 
required for strong X-ray emission. Whilst a cursory glance in the 
X-ray unveils the presence of high mass systems, to select those 
with lower masses, unresolved gas components, distant or gas-poor 
clusters, one must look to alternative approaches. 

There has been a half-century history of cluster identification in the optical regime. Early
'eyeball' surveys of photographic plates produced the earliest cluster catalogues (Abell 1958;
Zwicky, Herzog & Wild 1961; Abell, Corwin & Olowin 1989) and allowed the first statistical
study of the cluster population. When cluster and group samples were later constructed with
the help of digitised photographic plates (such as the APM; Dalton et al. 1992) and galaxy
spectra (Eke et al. 2004), the task of identification passed from human to machine. With the
advent of wide-field multi-band CCD imaging, assembly of vast galaxy samples has become the
standard. For example, Sloan Digital Sky Survey (SDSS; York et al. 2000) optical imaging data
has vastly increased both the volume and detail of detected astronomical sources, to date
generating five-band ugriz photometry for ~230 million objects (DR7; Abazajian et al. 2009).
Although one can estimate galaxy redshifts photometrically based on SED template fitting
(Csabai et al. 2003), neural networks (Collister & Lahav 2004) or a combination of the two
(Abazajian et al. 2009, §4.6), photo-zs are prone to large uncertainties and are generally
unsuitable for accurate 3D reconstructions of the galaxy distribution (although for recent
approaches using the entire photometric redshift distribution, see Liu et al. 2008).

Armed with only the angular positions of galaxies, automated algorithms have been developed
to identify clusters as projected overdensities in the plane of the sky (Lidman & Peterson
1996; Postman et al. 1996). These often come at the expense of model dependency and
sensitivity to the boundaries and holes common in galaxy catalogues. More geometric
approaches have made use of the Voronoi Tessellation (VT) to map the projected density
distribution of galaxies. Using the Voronoi cell area as a proxy for the local galaxy
density, VTs were first suggested as a non-parametric means of astrophysical source detection
by Ebeling & Wiedenmann (1993), and later applied to cluster detection by Ramella et al.
(2001). Voronoi techniques have also been used in void detection (Ryden 1995; El-Ad, Piran &
da Costa 1996) and the identification of large-scale structure (Icke & van de Weygaert 1991).
However, these approaches tend to suffer from contamination arising from the inclusion of
background and foreground field galaxies.

Gladders & Yee (2000) proposed a powerful method that picks out the near-ubiquitous signature
of galaxy clusters from photometric surveys. Star formation rates of galaxies bound in the
potential wells of clusters are suppressed when the cold gas supply is depleted by
environmentally-driven stripping or starvation processes (Balogh, Navarro & Morris 2000). The
passively evolving stellar populations in these galaxies develop strong metal absorption
lines blueward of 4000 Å, giving rise to a break, or step, in their spectra. In broad-band
photometric filters, these cluster members appear nearly uniformly red between the bands that
straddle the spectral break. Because cluster galaxies occupy a wide range of masses
(luminosities), these characteristic colours produce a distinct ridge-line, or "red sequence"
(Bower, Lucey & Ellis 1992), in colour-magnitude space. The dichotomy between this quiescent
population of predominantly E/S0 galaxies and the star-forming population of spiral-dominated
field galaxies is observed as a bi-modality of galaxy colours. With increasing redshift, the
4000 Å break moves redward; the Gladders & Yee (2000) prescription for cluster detection
exploits both the strong colour bi-modality in the galaxy distribution and the
colour-redshift relation to isolate clusters of galaxies over a range of epochs.

With a growing body of infrared data (specifically, from the IRAC cameras on board the
Spitzer Space Telescope), efforts such as the Spitzer Adaptation of the Red-Sequence Cluster
Survey (SpARCS; Wilson et al. 2009) have already turned to pushing red-sequence cluster
searches beyond the optical/NIR regime. With evidence of cluster sequences in place up to
z ~ 1.5 (Papovich et al. 2010; Hayashi et al. 2011) and perhaps as early as z = 3 (Kodama et
al. 2007; Doherty et al. 2010), tracking the 4000 Å break further redward shows great
potential for filling the 1.4 < z < 2.2 cluster desert. These distant systems may hold some
crucial clues for our understanding of galaxy formation and evolution.

Future observational campaigns such as the Large Synoptic Survey Telescope (LSST; Ivezić et
al. 2008) are set to push forward the frontiers of wide-area, deep multi-band optical sky
surveys. More immediately, Pan-STARRS-1 (PS-1; Kaiser et al. 2002), the first of four 1.8m
telescopes, is currently imaging 3/4 of the sky with deep and well characterised (Stubbs et
al. 2010) five-band
photometry. Algorithms capable of processing the petabyte-scale 
sky surveys of these next-generation facilities will be best placed 
to supply data products fully exploiting their advances. Cluster se- 
lection by red sequence is set to remain highly relevant to the con- 
struction of cluster catalogues using these forthcoming surveys. 

One approach to cluster detection in these deeper datasets is through "matched-filter" (MF;
Postman et al. 1996) algorithms that distill the large body of collected cluster data into a
likelihood function, recovering systems by maximising the likelihood of survey data fitting
the model. In particular, these filters may specify the cluster luminosity function, radial
density distribution, behaviour of the red sequence ridgeline and in some cases the presence
of a central Brightest Cluster Galaxy (BCG) (maxBCG; Koester et al. 2007b). MF algorithms
often confer redshift and richness estimates as part of the detection procedure. The MF
technique has been successful in extracting cluster signals from a diverse range of galaxy
surveys, including the SDSS (Goto et al. 2002; Koester et al. 2007a) and the Canada France
Hawaii Telescope Legacy Survey (CFHTLS; Gladders & Yee 2005; Thanjavur, Willis & Crampton
2009). The maxBCG SDSS cluster catalogue (Koester et al. 2007a) has facilitated a more
detailed study of the cluster red sequence (Hao et al. 2009), which may in turn provide added
refinements to future algorithms.

However, the advantage of MF algorithms can also be their 
drawback: such techniques will preferentially recover the clusters 
they are designed to match, but those not fitting the model are 
less likely to be identified. Many matched filter approaches are also based on uniform
background galaxy distributions, and experience degraded performance (Kim et al. 2002) under
more realistic backgrounds.
Our cluster detection philosophy is designed to be distinct from, but entirely
complementary to, the variety of matched
filter algorithms available. This study relaxes theoretically and 
observationally-motivated constraints, permitting a broader explo- 
ration of systems with projected overdensities. Specifically, we do 
not assume cluster red sequences occupy a particular position in 
colour-magnitude space, nor do we stipulate preferred distributions 
for the projected position of cluster members on the sky. Through 
this approach we hope to provide both an independent catalogue 
of clusters and a means to refine our understanding of character- 
istic cluster properties. The lack of selection criteria in our algo- 
rithm permits a double-check of the detections, since we can ask 
if the identified system conforms to our expectations. As we shall 
later demonstrate (see §5 and Figure 17), the prescription presented here may lead to
improved recovery of certain systems and better agreement with X-ray cluster data. Moreover,
because our proposed technique makes only two assumptions about cluster properties, it is
sensitive to a wide range of clusters, including aspherical/asymmetric systems in the process
of merging (Clowe et al. 2006) and fossil groups (Schirmer et al. 2010) with luminosity
functions unlike a Schechter (1976) function.

In this paper, we present our detection prescription, which 
involves a blind scan of colour-magnitude space (to locate clus- 
ter sequences) and a Voronoi tessellation technique (to estimate 
the galaxy surface density distribution). Requiring only two bands 
to detect spectral breaks, our approach provides a very efficient 
method of detecting clusters in wide-area CCD imaging of the sky. 
Whilst algorithms have in the past used Voronoi tessellations to find clusters, previous
attempts either do not exploit the red sequence or instead use photometric redshift
distribution functions that rely sensitively on the absolute calibration and number of
photometric bands (van Breukelen & Clewley 2009; Soares-Santos et al. 2011). In this paper we
describe the algorithm and apply it to a 7 square degree sample of SDSS Stripe 82 data. A
companion paper (GMB11) presents the Stripe 82 catalogue covering the full 270 square
degrees.

The outline of this paper is as follows. In section 2 we define the data used for the
cluster search in the SDSS and mock catalogues. Section 3 describes the algorithm
step-by-step. Section 4 describes the application and testing of the algorithm using real
astronomical data, followed by a brief comparison with existing cluster catalogues in
section 5. We describe the detection of mock clusters in simulated data in section 6,
followed by performance tests on the simulated catalogues. In section 7 we summarise our
findings.

Throughout, we assume a ΛCDM cosmology with $\Omega_m = 0.3$, $\Omega_\Lambda = 0.7$,
$H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$ and $h = H_0 / 100$ km s$^{-1}$ Mpc$^{-1}$. For SDSS data we
use the Sloan photometric system (Gunn et al. 1998) and "model" magnitudes.



2 DATA

2.1 SDSS Stripe 82

We extract Sloan Digital Sky Survey Data Release 7 griz photometry for all sources with
extinction-corrected (Schlegel, Finkbeiner & Davis 1998) r-band model magnitudes $r \leq 24$
in the deep coadd stripe centred on the celestial equator ("Stripe 82") from the SDSS Catalog
Archive Server (CAS¹). To minimise stellar contamination, we select only galaxies where the
offset between the r-band PSF and model magnitudes satisfies
$|r_{\rm PSF} - r_{\rm model}| > 0.05$. We exclude bright ($r_{\rm model} < 14$) galaxies and
spurious sources such as overly de-blended galaxies and fragmented stellar haloes.

Although no spectroscopic or photometric redshift estimates are used in detections, we
post-process the cluster catalogue to estimate the redshift of each system. Cluster galaxies
are assigned spectroscopic redshifts by matching source positions in the SDSS DR7, WiggleZ
DR1 (Drinkwater et al. 2010) and 2SLAQ (Croom et al. 2009) catalogues to within 1". Where
spectroscopic redshift data are unavailable, we use SDSS DR7 photometric redshifts (see
Abazajian et al. 2009 and references therein). To increase both the source catalogue redshift
completeness and the redshift accuracy for galaxies with no spectra, we supplement these data
with additional photometric redshifts. We select all galaxies later identified by ORCA in the
GMB11 Stripe 82 catalogue and estimate their redshifts using the hyperz code² (Bolzonella,
Miralles & Pelló 2000) with ugriz model magnitudes and errors. The SDSS Stripe 82 input
catalogue contains 11,358,087 galaxies with Galactic extinction corrected (Schlegel,
Finkbeiner & Davis 1998) griz model magnitudes, over $-50^\circ < \alpha < 59^\circ$ and
$\delta = \pm 1.25^\circ$. In this study, we concentrate on a 7 square degree sub-region
within this catalogue, centred at $(\alpha, \delta) = (355.52^\circ, 0^\circ)$ and comprising
291,389 galaxies (magnitude cuts applied to these galaxies for cluster detection are
discussed in §3.7.1). This sample, covering the same area as the mock survey described below,
was considered a large enough observational dataset with which to test the algorithm. GMB11
describe findings from the ORCA catalogue based on the full 270 square degree dataset.

2.2 Mock Pan-STARRS Medium Deep Survey catalogue

Cai et al. (2009) discuss the assembly of a light cone from the Millennium Simulation
(Springel et al. 2005) with a 3° opening angle, equivalent to a single pointing of the
Pan-STARRS Telescope 1 (PS-1), and the area of a single MDS tile. The Millennium Simulation
provides the ΛCDM architecture into which galaxies are populated using the Bower et al.
(2006) semi-analytic GALFORM model (Cole et al. 2000). This creates a dataset with PS-1 grizy
photometry for 2,346,468 galaxies down to a magnitude limit of r < 27.5 (equivalent to the
expected 5σ depth for the PS-1 MDS) and a median redshift of z = 1.05. The similarity of the
PS-1 bands to the SDSS photometric system allows us to apply the same magnitude limits as
those set for the Stripe 82 data (§3.7.1).

3 THE METHOD

In this section we first outline, and then detail, the main components of the ORCA cluster
detector.

3.1 Algorithm Outline

Here we describe the main steps of the ORCA algorithm. With photometry in several bands, we
calculate galaxy colours in consecutive (g − r, r − i, etc.) band pairs.

1 We define a simple photometric selection using the colours and magnitudes of the sample.
This selection could be simple, for example one or more narrow slices in colour-magnitude
space, or a more complex selection function. This selection function can be modified in
successive applications of the algorithm to blindly scan the full photometric space, and thus
isolate red sequences across a range of redshifts (Gladders & Yee 2000, 2005).

2 In each pass of the algorithm, we apply the photometric selection to the catalogue, thus
greatly restricting the total number of galaxies under consideration. In the case of using
two colours concurrently, this can be a very effective means of reducing fore- and background
contamination of a putative cluster characterised by some red sequence.

3 After the selection, we calculate the Voronoi diagram of the projected distribution of
galaxies on the sky. The inverse of the area of the convex Voronoi cell surrounding each
galaxy can be used as an estimate of the local surface density.

4 Galaxies residing in dense cells (satisfying some threshold criteria) can be connected
together into conglomerations. If enough galaxies are joined together in this way, we define
a cluster.

5 In the blind scan, successive photometric cuts may select the same structures (since the
adjustment of the selection is by design less than the typical width of a red sequence).
Multiple detections of the same structure are identified and reduced to a single detection
(we discuss how this was implemented in §3.6).

An illustrative overview of the above procedure can be seen in Figure 1.

Figure 1. A depiction of the ORCA detector applied to a 9'x9' cut-out region of Stripe 82.
Starting with all galaxies in the box (first panel), a photometric selection (§3.2) isolates
galaxies within a specific redshift range (second panel); any clusters in this field will be
evident as surface overdensities. In the third panel, we compute the Voronoi diagram (§3.4)
of the distribution to estimate the surface density of remaining galaxies. These are
separated into overdense (yellow) and underdense (grey) cells in panel four, according to how
likely they are to belong to a random distribution (§3.4). In the final panel, we use a
Friends-Of-Friends percolation algorithm (§3.5) to connect overdense cells until the density
of the whole system falls below a density threshold. Galaxies in the blue cells become
members of a cluster if there are at least $N_{\rm min}$ linked members.

¹ http://casjobs.sdss.org

² http://webast.ast.obs-mip.fr/hyperz

3.2 Photometric filtering 

In large-scale imaging surveys, groups and clusters are apparent as 
overdensities in the projected distribution of galaxies. Cluster de- 
tection methods reliant only on determining the projected galaxy 
density distribution are often plagued by two problems: (i) projec- 
tion effects contaminating clusters with unassociated foreground 
and background galaxies, and (ii) the inclusion of spurious cluster detections
arising from noisy data or chance projected overdensities.

Figure 2. The redshift evolution of the observed-frame r-i colour from a sample of mock
galaxies. The colours indicate the density of galaxies at each point, with red being the
highest. We are able to exploit this observed relation to isolate cluster galaxies within a
specific redshift range by using a selection (such as the shaded strip in this Figure) to
select galaxies from a narrow colour range.

To mitigate these problems, the contrast of genuine clusters can be enhanced by applying a
photometric selection filter in colour-magnitude space, to isolate the red-sequence
ridge-line. We parametrise our selection as a slice in colour-magnitude space, defined by a
colour-magnitude normalisation ($c_{m20}$, the colour at twentieth magnitude), slope
$\beta(c_{m20})$ and width $\sigma(c_{m20})$. The expected evolution of red sequence colours
is constrained from simple stellar evolution models, meaning that scans over an appropriate
set of photometric selection filters allow the isolation of clusters over a slew of
redshifts. Figure 2 shows the redshift evolution of galaxy colours in a sample of mock
galaxies from Merson et al. (2011, in preparation) and shows an additional advantage in using
such filters. The two tracks visibly demonstrate the bimodality in galaxy colour that
manifests itself as the "red sequence" (lower track; Bower, Lucey & Ellis 1992) and "blue
cloud" (upper track). By selecting galaxies within a specific colour range $\Delta c$ (as
denoted by the green region in the Figure), one may isolate red sequence cluster galaxies
within the redshift range $\Delta z$. Contaminants in this selection are bluer galaxies from
higher redshifts. By simultaneously selecting galaxies from two photometric selections in
different colours, one can eliminate degeneracies between colour tracks. We discuss this
further in the following section.
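
As an illustration only, the slice selection described above might be sketched as follows. The function name, the interpretation of the width as a full width centred on the ridge-line, and the numerical values are assumptions of this sketch, not the ORCA implementation or its adopted parameters.

```python
import numpy as np


def red_sequence_filter(mag, colour, c_m20, slope, width):
    """Boolean mask of galaxies inside a colour-magnitude slice.

    The slice is centred on the line colour = c_m20 + slope * (mag - 20).
    A galaxy passes if it lies within +/- width / 2 of that line; reading
    `width` as a full width is an assumption of this sketch.
    """
    mag = np.asarray(mag, dtype=float)
    colour = np.asarray(colour, dtype=float)
    ridge = c_m20 + slope * (mag - 20.0)
    return np.abs(colour - ridge) <= 0.5 * width


# Toy catalogue: three galaxies on a sequence plus one much redder object.
mag = np.array([18.0, 20.0, 22.0, 21.0])
colour = np.array([0.58, 0.50, 0.46, 0.90])
mask = red_sequence_filter(mag, colour, c_m20=0.5, slope=-0.02, width=0.1)
```

Only galaxies for which `mask` is true would be passed on to the surface density estimate.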

Figure 3. An illustration of the Voronoi technique described in §3.4. The left panel is the
Voronoi diagram of a random distribution of points. The middle panel is the equivalent
diagram for galaxies in a field with the same mean density as the random field. The right
panel shows the ratio of galaxy cell counts to random cell counts for a range of values of
the integral distribution of cell areas (Equation 1, from Kiang 1966). There is a notable
excess fraction of galaxy cells relative to random cells at low values of P(a), permitting
the use of a threshold to separate clustered galaxies from field galaxies.

The algorithm allows $\beta(c_{m20})$ and $\sigma(c_{m20})$ to adopt any values as the
detector scans through colour-magnitude space. The simple prescription we adopt is that of a
slope and width that remain fixed as the normalisation changes. Although the observed-frame
sequence slope is known to evolve with redshift (Gladders et al. 1998; Stanford, Eisenhardt
& Dickinson 1998; Stott et al. 2009), our choice of photometric selection width encompasses a
range of sequence gradients large enough to account for evolution as the algorithm searches
to deeper redshifts. Analysis of mock clusters from the Millennium Simulation suggests this
approach probes at least 2.5 (1.5) magnitudes fainter (brighter) than the observed
characteristic galaxy flux at the redshifts at which clusters are detected in this study.
With measurements from a large ORCA cluster catalogue, future refinements to the algorithm
may include a description of how the sequence slope varies with normalisation $c_{m20}$. The
values adopted for $\beta$ and $\sigma$ are discussed in §3.7.

We scan through colour-magnitude space in a colour $C_a$ from blue to red, placing down a
series of M photometric selection filters $f(C_{a1}), f(C_{a2}), \ldots, f(C_{aM})$ by
increasing the normalisation $c_{m20}$ in small increments dc. The size of this increment,
set in §3.7.1, allows adjacent filters to overlap, ensuring clusters close to the boundary of
a filter are well sampled. Because each photometric selection isolates cluster galaxies
(where they exist) from a specific redshift range, the detector can identify multiple
clusters in the same line of sight. We determine the sensitivity of the algorithm to
projection in §4.6.4.
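
The blue-to-red scan can be pictured with the short sketch below. The helper names, the colour limits and the step size are hypothetical; the values actually adopted are those the paper sets in §3.7.1, which are not reproduced here.

```python
import numpy as np


def filter_normalisations(c_blue, c_red, dc):
    """Normalisations c_m20 for a blue-to-red scan in steps of dc."""
    # Small tolerance guards against floating-point round-off in the count.
    n_filters = int(np.floor((c_red - c_blue) / dc + 1e-9)) + 1
    return c_blue + dc * np.arange(n_filters)


def filters_overlap(width, dc):
    """Adjacent slices of full width `width` overlap whenever dc < width."""
    return dc < width


# Hypothetical scan from c_m20 = 0.0 to 1.0 in steps of 0.25.
norms = filter_normalisations(0.0, 1.0, 0.25)
```

With a step smaller than the filter width, a galaxy near the edge of one slice also falls inside its neighbour, which is why the same cluster can later appear in more than one filter.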

3.3 Dual-colour photometric filtering 

Although only one colour is necessary to detect clusters, Figure 2 illustrates the
colour-redshift degeneracy that arises when attempting to isolate a redshift regime with a
single colour selection. One can break the degeneracy and further reduce the field galaxy
contamination by identifying the colour range cluster members have in a second colour $C_b$,
and subsequently applying a series of joint photometric filters in both $C_a$ and $C_b$. To
establish the $C_b$ colour range to scan, we take all cluster members from the preliminary
detection ($C_a$ only), de-trend their sequence slopes and fit a Gaussian to the colour
distribution. The $C_b$ colour range $\Delta C_b$ is taken to be $\pm 1\sigma$ from the
Gaussian mean.

If the Gaussian fit is poor, detection of a clear sequence in both $C_b$ and $C_a$ is less
likely. In this case $\Delta C_b$ is simply $\pm 1\sigma$ from the median of the $C_b$
colour distribution. The algorithm then scans over this second colour range and attempts to
detect the cluster in both colours.

A filter pair in $C_a$ and $C_b$ (hereafter $\{C_a, C_b\}$) requires a detectable sequence
in both colours, and amplifies the cluster signal by eliminating field galaxies in the $C_a$
filter that fail to appear within the $C_b$ filter. Any cluster in the final catalogue
detected in $C_a$ must therefore also have been detected in $C_b$. This improves the
robustness of the algorithm and reduces contamination from spurious detections. Because
sub-filters overlap in $C_b$ colour-magnitude space, the same cluster may be detected in
multiple filters. We apply the prescription described in §3.6 to identify and merge clusters
that have been detected in more than one filter. The number of selection filters used to
sample any colour range depends on the sampling interval dc set in §3.7.1.
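
A minimal sketch of how the second colour range might be estimated is given below. The text does not state how a "poor" Gaussian fit is judged or which scatter is used with the median, so the Kolmogorov-Smirnov test, the `p_min` threshold and the reuse of the sample standard deviation are all assumptions of this sketch.

```python
import numpy as np
from scipy import stats


def second_colour_range(mag, colour_b, n_sigma=1.0, p_min=0.05):
    """Return (low, high) limits of the C_b colour range to scan.

    Steps: remove the linear colour-magnitude trend, fit a Gaussian to the
    de-trended colours, and take +/- n_sigma about the Gaussian mean. If the
    fit looks poor (here: a KS test p-value below p_min, an assumption),
    centre the range on the median instead.
    """
    mag = np.asarray(mag, dtype=float)
    colour_b = np.asarray(colour_b, dtype=float)
    # De-trend the sequence: colours are referred to 20th magnitude.
    slope, _ = np.polyfit(mag - 20.0, colour_b, 1)
    detrended = colour_b - slope * (mag - 20.0)
    mu, sigma = stats.norm.fit(detrended)
    p_value = stats.kstest(detrended, "norm", args=(mu, sigma)).pvalue
    centre = mu if p_value >= p_min else np.median(detrended)
    return centre - n_sigma * sigma, centre + n_sigma * sigma


# Synthetic members: sequence with colour 1.0 at m = 20 and scatter 0.05.
rng = np.random.default_rng(0)
mag = rng.uniform(18.0, 22.0, 500)
colour_b = 1.0 - 0.03 * (mag - 20.0) + rng.normal(0.0, 0.05, 500)
low, high = second_colour_range(mag, colour_b)
```

The returned limits would then bound the series of $C_b$ filters applied jointly with the $C_a$ filter.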

3.4 Identifying overdensities with the Voronoi tessellation 

After increasing a cluster's detectability by suppressing field galaxies with photometric
filters, the next step is to calculate the local surface density of each galaxy. Galaxies
residing in common regions of enhanced density can then be grouped together into clusters.
To quantify the surface density field, we divide the galaxies into Voronoi cells using
qhull³ (Barber, Dobkin & Huhdanpaa 1996). The Voronoi diagram is a tessellation of convex
polygons, or cells, with each galaxy occupying only one cell. All positions inside a given
cell are closer to the cell's nucleus (the galaxy) than to any other. Unlike many other
detection techniques, the Voronoi Tessellation (for VT cluster detection, see Ebeling &
Wiedenmann 1993; Ramella et al. 2001) does not smooth the data, is robust to cluster
ellipticity (Plionis, Barrow & Frenk 1991) and can be applied to a variety of survey
geometries. VTs do not suffer from spurious detections around survey boundaries and edges,
and are thus well suited to analysing astronomical data with localised camera defects,
excised bright stars and other sources of incompleteness. The left and middle panels of
Figure 3 respectively show the Voronoi diagrams for a random point distribution and for
galaxies with identical mean densities $\Sigma$. Galaxies in more concentrated regions tend
to have smaller cells.
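
The cell areas can be computed along the following lines. This sketch uses `scipy.spatial.Voronoi` (which wraps Qhull) as a stand-in for a direct qhull call; treating unbounded edge cells as having infinite area, and hence zero density, is an assumption made here for simplicity.

```python
import numpy as np
from scipy.spatial import Voronoi


def voronoi_cell_areas(points):
    """Area of each point's Voronoi cell; unbounded cells get np.inf."""
    points = np.asarray(points, dtype=float)
    vor = Voronoi(points)
    areas = np.full(len(points), np.inf)
    for i, region_index in enumerate(vor.point_region):
        region = vor.regions[region_index]
        if len(region) == 0 or -1 in region:
            continue  # cell extends to infinity (survey edge)
        poly = vor.vertices[region]
        # Order the vertices by angle about the centroid (cells are convex),
        # then apply the shoelace formula.
        centre = poly.mean(axis=0)
        angle = np.arctan2(poly[:, 1] - centre[1], poly[:, 0] - centre[0])
        x, y = poly[np.argsort(angle)].T
        areas[i] = 0.5 * abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
    return areas


# 3x3 grid of points: only the central point has a bounded (unit square) cell.
grid = np.array([[i, j] for i in range(3) for j in range(3)], dtype=float)
areas = voronoi_cell_areas(grid)
density = 1.0 / areas  # local surface density estimate; 0 for unbounded cells
```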

³ http://www.qhull.org

Figure 4. A sequence of Voronoi diagrams generated from galaxies in the same area of sky, but
selected from different photometric filters. A cluster signal is apparent for some filters,
but is not apparent in others. This demonstrates the power of colour selection in isolating
galaxies at specific redshifts. In cases where a cluster may be detected in more than one
filter (such as the borderline detection in the second panel), the algorithm must decide
which cluster to select. This aspect of the detector is discussed in §3.6.

We define the reciprocal of the galaxy cell area ($a_g$) as an estimate of the galaxy's
local surface density $\Sigma_g$. Searching for connected regions of high density identifies
statistically significant structures. To determine if a galaxy resides in a high density
region of the survey, we evaluate the statistical significance of finding a cell of area
$a_g$ in a random field with mean cell area $a_R$. We use the Kiang (1966) cumulative
function for a Poissonian distribution of points:

$$P(a) = \int_0^a dp = 1 - e^{-4a}\left(\frac{32a^3}{3} + 8a^2 + 4a + 1\right) \qquad (1)$$

where $a = a_g / a_R$. The right panel of Figure 3 shows the distribution P(a) for cells in
an example galaxy field relative to a Poisson distribution of the same field size and number
of points. Candidate cluster galaxies residing in overdense regions can be selected by cell
areas statistically unlikely to arise in a random distribution. An excess of galaxy cells is
apparent for low P(a) compared to the random distribution. We identify all galaxies with
$P(a_g) < P_{\rm thresh}$ in order to select a population of clustered galaxies.
The choice of overdensity probability threshold is discussed in
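
For concreteness, the Kiang (1966) cumulative function for two-dimensional cells (the integral of a gamma distribution with shape parameter 4) and the resulting threshold cut can be written as below. The function names and the value of `p_thresh` are hypothetical, and finite cell areas are assumed.

```python
import numpy as np


def kiang_cdf(a):
    """Cumulative Kiang (1966) distribution of normalised 2D cell areas.

    `a` is the cell area in units of the mean cell area of a random field.
    """
    a = np.asarray(a, dtype=float)
    poly = 32.0 * a**3 / 3.0 + 8.0 * a**2 + 4.0 * a + 1.0
    return 1.0 - np.exp(-4.0 * a) * poly


def overdense_mask(cell_areas, mean_random_area, p_thresh):
    """True where a cell is smaller than expected at probability p_thresh."""
    a = np.asarray(cell_areas, dtype=float) / mean_random_area
    return kiang_cdf(a) < p_thresh


# Three cells: very small, average and large (hypothetical threshold 0.1).
selected = overdense_mask([0.1, 1.0, 3.0], mean_random_area=1.0, p_thresh=0.1)
```

Small cells have low P(a), so only the first cell in this toy example is flagged as overdense.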

3.5 Connecting overdense regions to form clusters 

Remaining galaxies belonging only to overdense cells are now 
grouped together to form clusters. We achieve this by applying a 
Friends-Of-Friends algorithm to these cells. Rather than a distance 
criterion, we define a "friend" as an adjacent Voronoi cell sharing at 
least one vertex. Potential clusters are seeded by ordering the cells 
with decreasing density, iterating through and connecting adjacent 
cells. These overdense regions grow by percolation until either no 
more adjacent overdense cells remain, or the mean cell density of 
the putative cluster: 

\bar{\Sigma}_{cells} = \frac{1}{N_{gal}} \sum_{i=1}^{N_{gal}} \frac{1}{a_i} < \Sigma_{crit}    (2)

Groups of connected galaxies are classified as clusters if they
have Ngal ≥ Nmin. The choice of the critical density threshold
Σcrit and Nmin algorithm parameters is discussed in §3.7.1.
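
The percolation described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the detector's implementation: cell adjacency is supplied as a precomputed, symmetric dictionary (hypothetical input), the density of a cell is taken as the inverse of its area, and growth of a group stops as soon as adding the densest adjacent cell would pull the mean density below the critical value.

```python
def grow_clusters(areas, neighbours, overdense, sigma_crit, n_min):
    """Group overdense Voronoi cells into clusters by percolation.

    areas      : dict, galaxy id -> Voronoi cell area
    neighbours : dict, galaxy id -> set of ids of adjacent cells
    overdense  : set of galaxy ids that passed the probability cut
    sigma_crit : critical mean cell density of a growing group
    n_min      : minimum membership for a group to count as a cluster
    """
    density = {g: 1.0 / areas[g] for g in overdense}
    unassigned = set(overdense)
    clusters = []
    # Seed groups from the densest cells first.
    for seed in sorted(overdense, key=density.get, reverse=True):
        if seed not in unassigned:
            continue
        unassigned.remove(seed)
        members = [seed]
        total = density[seed]
        while True:
            # Unassigned overdense cells adjacent to any current member.
            frontier = {
                n
                for m in members
                for n in neighbours.get(m, ())
                if n in unassigned
            }
            if not frontier:
                break
            best = max(frontier, key=density.get)
            # Stop if the mean density would drop below the critical value.
            if (total + density[best]) / (len(members) + 1) < sigma_crit:
                break
            members.append(best)
            total += density[best]
            unassigned.remove(best)
        if len(members) >= n_min:
            clusters.append(sorted(members))
    return clusters
```

Cells left over when a group stops growing remain available to seed or join later groups; whether the detector treats them this way is an assumption of the sketch.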

3.6 Producing a cluster catalogue 

In §3.2 and §3.3 we noted that adjacent photometric filters applied
to the input catalogue overlap in colour-magnitude space. With this
sampling strategy, the same cluster could be detected in multiple
filters. Figure 4 shows a sequence of Voronoi tessellations applied
to the same area of sky under photometric filters sensitive to differ- 
ent redshift ranges. Because colour scans sample the colour range 
of a red-sequence at a specific redshift, the cluster will be detected 
in multiple scans (with a peak contrast where the selection is most 
effective). In cases of clusters detected multiple times in different 



photometric filters, the "best" cluster is identified and added to the 
final cluster catalogue. 

For two candidates to be considered detections of the same 
system, they must have sufficiently similar spatial positions, red 
sequence fits and cluster members. We quantify the similarity in 
cluster sequences using linear fits to the colour-magnitude relation 
for the galaxies in each cluster detected. Sequence slopes can in 
principle adopt any value permitted by the width of the photometric
filter (defined here as σf) it was selected in. We quantify the
similarity between two sequences with the following criteria: 

- Sequence match 1 (ΔS1): True if the sequence separation is <
0.5σf in colour for at least 25% of the magnitude range mBCG ≤
m < mBCG + 5.

- Sequence match 2 (ΔS2): True if the sequence separation is
< σf in colour difference for at least 50% of the range described
in ΔS1.

- Sequence match 3 (ΔS3): True if the colour difference at 20th
magnitude (Δcm20) between the two sequences is < σf.

- Sequence match 4 (ΔS4): True if the clusters were detected in
adjacent (overlapping) filters.

To define the similarity in cluster membership, spatial position
and extent, we describe the common-galaxy fraction and projection
extent for two clusters, CL1 and CL2:

- Common galaxies (cg1,2): the fraction of galaxies in CL1 that
also belong to CL2. Similarly, cg2,1 is the fraction of CL2 galaxies
also appearing in CL1. The BCGid boolean notes when clusters
share the same BCG.

- Projection extent (pe1,2): the fraction of galaxies in CL1 that
lie within the Voronoi cell boundaries of the CL2 cluster. As with
cg, pe2,1 is the equivalent fraction for CL2.

With these measures, five tests of "cluster similarity" were devised
(Table 1). A pair of clusters must pass at least one to be considered
detections of the same system. Each of these tests accounts for both
the spatial and colour characteristics of the clusters. Because no
merging can proceed purely by colour similarity or spatial
coincidence, this ensures the separation of associated but distinct
systems, and clusters in projection. We balance these requirements
with the need to prevent multiple instances of the same cluster
appearing in the final catalogue. Where matches between two clusters
exist, the thresholds in Table 1 make it likely the two systems will
be merged.
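
The five tests can be written as a single boolean check. The sketch below is illustrative only: the `PairMeasures` container and its field names are hypothetical, the four sequence-match flags are assumed to have been evaluated beforehand, and the thresholds are those read from Table 1, with "(x OR y) > t" interpreted as "either fraction exceeds t".

```python
from dataclasses import dataclass


@dataclass
class PairMeasures:
    cg12: float      # fraction of CL1 galaxies also in CL2
    cg21: float      # fraction of CL2 galaxies also in CL1
    pe12: float      # fraction of CL1 galaxies inside the CL2 cell boundaries
    pe21: float      # fraction of CL2 galaxies inside the CL1 cell boundaries
    same_bcg: bool   # BCGid: both candidates share the same BCG
    ds1: bool        # sequence match 1
    ds2: bool        # sequence match 2
    ds3: bool        # sequence match 3
    ds4: bool        # sequence match 4


def is_same_system(p):
    """True if the pair passes at least one of the five similarity tests."""
    pe_max = max(p.pe12, p.pe21)
    tests = (
        max(p.cg12, p.cg21) > 0.5,
        pe_max > 0.0 and p.ds1,
        p.same_bcg and p.ds2,
        pe_max > 0.8 and p.ds3,
        pe_max > 0.8 and p.ds4,
    )
    return any(tests)
```

Note that a large projection overlap with no sequence match, or a sequence match with no spatial overlap, does not trigger a merge.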

To define the "best" cluster from a list of candidates, we pick 
out the system with the largest reduced flux - the total flux (in the 
detected band) of all but the three brightest cluster members. This

Figure 5. (Top) Colour-magnitude diagrams for the 126 Abell 2631 members selected in this study. The yellow dot notes the position of the cluster r-band
brightest cluster galaxy. The black lines denote photometric selection filter fits to the data and indicate the slope (β), normalisation (solid, cm20) and width
(dotted, σ). The identified members are split into those inside (blue) and outside (red) the 3-sigma cut used to estimate the filter width. Grey data indicate all
galaxies that were not identified as members of the cluster out to a radius of 7-arcminutes from the cluster centre. The red dashed line in the g-r colour indicates
the blue limit imposed by the Virgo cluster, and the equivalent lines in r-i and i-z denote the lowest cm20 identified from cluster sequences in our search of the
7 square degree Stripe 82 survey. (Bottom) The colour-magnitude diagrams for galaxies in a region of the same area located in a field environment.



Table 1. The set of conditions used to consider whether two clusters are
multiple detections of the same system. If any one of these conditions is
satisfied, the algorithm picks the "best" cluster of the two.

#  Constraint
1  (cg1,2 OR cg2,1) > 0.5
2  (pe1,2 OR pe2,1) > 0 AND ΔS1
3  BCGid AND ΔS2
4  (pe1,2 OR pe2,1) > 0.8 AND ΔS3
5  (pe1,2 OR pe2,1) > 0.8 AND ΔS4



prevents the selection of a cluster including one or two bright galax- 
ies that may not be genuine members, but also makes the choice 
of best cluster largely independent of the BCG. Once the "best" 
cluster is selected, the remaining candidates are discarded from the 
catalogue. However, to each cluster selected in this way, we attach 
a record of the candidate cluster galaxies that were not selected, 
forming an auxiliary catalogue of associate cluster members. In 
this way, we can keep track of galaxies the detector considered as 
members but did not include in the cluster. The degree of oversampling
in colour space and hence number of multiple detections
depends on the sampling interval dc, relative to the width σ(cm20)
of the filter. We set both of these parameters in §3.7.1.
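
The "best" cluster choice described above can be sketched in a few lines. This is an illustration under simple assumptions: each candidate is represented only by a list of member fluxes in the detection band, and the dictionary layout and function names are hypothetical.

```python
def reduced_flux(member_fluxes, n_skip=3):
    """Total flux of all members except the n_skip brightest."""
    return sum(sorted(member_fluxes, reverse=True)[n_skip:])


def pick_best(candidates):
    """Return the key of the candidate with the largest reduced flux.

    candidates: dict mapping a candidate label to its list of member fluxes.
    """
    return max(candidates, key=lambda name: reduced_flux(candidates[name]))
```

In this example a candidate dominated by one very bright galaxy loses to a candidate with many moderately bright members, which is the behaviour the reduced flux is meant to produce.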



3.7 Algorithm parameters 

This section defines the values adopted for the algorithm parameters
described in §3.2 - §3.5.

3.7.1 Photometric filtering 

In both mock and real datasets, we limit our search for clusters to 
three colours: g-r, r-i and i-z. These are used to form joint selection 
filters combining two colours: {g-r, r-i} and {r-i, i-z}.

Each photometric filter is described by a colour normalisation
cm20, slope β(cm20) and width σ(cm20). For this study and that
of GMB11 we demonstrate the detector with an unchanging filter
slope and width. In order to set β and σ for each colour, 126 members
of Abell 2631 (Abell, Corwin & Olowin 1989) are visually
identified in an r and g composite Stripe 82 image. At redshift
z = 0.278 (Böhringer et al. 2000), this system is the richest Abell
cluster in Stripe 82 and shows evidence of a clear sequence in all
three colours used in this study.

A linear fit to the colour-magnitude sequence was applied to
determine β for each colour. The filter widths were set using a
method akin to that described in Gladders et al. (1998); we first
remove the slope in each sequence and then exclude 3σ outliers.
Starting at the line fitted to the cluster sequence, we increase the
width in equal amounts above and below this line until we enclose
90% of the remaining members. We define this as the filter width σ
for that colour.
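
The width estimate can be sketched as below. Several details are assumptions made for the sake of a runnable example: an ordinary least-squares fit, a single 3σ clipping pass based on the rms residual, and a returned width measured from the fitted line to the enclosing boundary on one side.

```python
import math


def fit_sequence(mags, colours):
    """Ordinary least-squares fit of colour against magnitude."""
    n = len(mags)
    mean_m = sum(mags) / n
    mean_c = sum(colours) / n
    sxx = sum((m - mean_m) ** 2 for m in mags)
    sxy = sum((m - mean_m) * (c - mean_c) for m, c in zip(mags, colours))
    slope = sxy / sxx
    return slope, mean_c - slope * mean_m


def filter_width(mags, colours, clip=3.0, enclosed=0.9):
    """Return (slope, width) for a cluster colour-magnitude sequence."""
    slope, intercept = fit_sequence(mags, colours)
    # Remove the tilt of the sequence.
    resid = [c - (slope * m + intercept) for m, c in zip(mags, colours)]
    rms = math.sqrt(sum(r * r for r in resid) / len(resid))
    # Exclude outliers, then widen symmetrically until the requested
    # fraction of the remaining members is enclosed.
    kept = sorted(abs(r) for r in resid if abs(r) <= clip * rms)
    k = math.ceil(enclosed * len(kept))
    return slope, kept[k - 1]
```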

Figure 5 shows the colour-magnitude sequence of the identified
members in the three colours (top) compared to a field of the

[Figure 6 axis label: r-band magnitude]

Figure 6. The SDSS model r-band photometric error in a sample of 100,000
Stripe 82 galaxies. These data are used to set a magnitude limit where at
least 50% (0.68σ, black horizontal dotted line) of the faintest galaxies remain
in a colour slice of width σf = 0.152. Whilst the data suggest a limit
of r ≤ 23.8, we opt for a slightly more conservative r ≤ 23.5 limit (red
vertical dashed line).



same area with no cluster present (bottom). Blue (red) points identify
members that were inside (outside) the 3σ cut used to identify
outliers. Grey data correspond to galaxies that were within 7' of the
cluster centre and not picked as cluster members. Table 2 lists the
fitted filter parameters for each colour (corresponding to the black
lines in Figure 5) in addition to the colour range and number of
filters used in our cluster search. Following our decision in §3.2 to
use a fixed slope, we adopt the largest filter width (σf, 0.152) for all
colours, and use this to define the input galaxy magnitude limit for
each band. Magnitude limits are applied to reduce the number of
input galaxies with high levels of photometric uncertainty. We set
these as the faintest magnitude where the photometric uncertainties
fall below 0.68σf.

We set limits for each band based on a sample of 100,000
galaxies from Stripe 82. Figure 6 shows the galaxy photometric error
distribution for the r-band, and from this we set a magnitude
limit of r ≤ 23.5. This is slightly more conservative than the limit
implied by the error distribution (r ≤ 23.8) because we aim to
include only sources with good photometry. The magnitude limits
applied are 24.0, 23.5, 23.3, 21.6 in the g, r, i and z bands respectively,
resulting in a source catalogue of 69,797 galaxies. With the
added depth from Stripe 82 photometry, these limits permit an
exploration of the red sequence to at least 2.5, 3 and 1.5 magnitudes
fainter than M* respectively for the r, i and z bands. As part of the
algorithm design, we considered multiple searches through the data
at different flux limits. Under this prescription, higher-signal cluster
sequences would be selected when re-detections of the same system
were merged. In tests with the mock lightcone data analysed in
§6 we found no significant advantage in this implementation, and
instead kept our magnitude limits fixed.

The bluest filter pair we employ is g-r. To prevent the detection
of spurious systems bluer than the z = 0 red-sequence in
this colour we determine a blue limit by extrapolating the colour-magnitude
relation (CMR) for Coma (Smith et al. 2009) and Virgo
(Rines & Geller 2008) to r = 20. The cm20 normalisation for Coma
(Virgo) was estimated as 0.6 (0.47); we use the latter as the bluest



Table 2. Filter parameters fitted from Abell 2631, the ranges searched and
the number of filters in each colour. The blue limit in g-r corresponds to an
extrapolation of the Virgo CMR, whilst the others permit a full sweep of the
available data. The largest filter width (σf = 0.152, for g-r) is
adopted for all colours.

Colour  Slope (β)  Width (σ)  Range          Filters
g-r     -0.048     0.152      0.47 - 2.00    39
r-i     -0.017     0.067      0.00 - 1.22    38
i-z     -0.023     0.110      -0.10 - 1.10   31



filter possible in the g-r colour. We do not apply similar limits to
the other colours, but the normalisation below which no sequences
were detected in r-i and i-z is described in §4.1. Figure 5 shows
these limits as red dashed lines.

Finally, the detection algorithm uses photometric filters that
overlap in colour-magnitude space, preventing clusters close to filter
edges from being poorly sampled. A sampling interval in colour
space of dc = 0.04 is chosen, corresponding to an overlap of approximately
75% between adjacent filters based on σf, the filter
width.
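
The quoted overlap follows from simple arithmetic if the filter width is taken as the full extent of a filter in colour, which is an assumption of this small check rather than something stated explicitly here. The function name is illustrative.

```python
def filter_overlap(dc, width):
    """Fractional overlap of two equal-width filters offset by dc in colour."""
    return max(0.0, 1.0 - dc / width)


# Sampling interval and filter width adopted in the text.
overlap = filter_overlap(0.04, 0.152)
```

Under that assumption the overlap evaluates to about 0.74, in line with the approximate 75% quoted above.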



3.7.2 Voronoi Tessellation and connection of overdense regions 

The initial identification of clusters in projected high density regions
and the subsequent percolation of their members depends
respectively on the probability threshold Pthresh and the critical
density Σcrit. We parametrise the critical density Σcrit as a scalar
multiple of Σ̄ such that both detection parameters have a mean density
dependence. In the left-hand sequence of Figure 7 we note the
effect a range of (Pthresh, Σcrit) combinations has on the recovery
of Abell 2631 within a box of scale 13.6'. By tracking the detector's
assignment of Voronoi cells to cluster and field, we compare
members visually identified to the recovery of this cluster under
different parameter combinations. The cells are colour-coded into
four groups to differentiate detected and visually identified members.
Grey cells show galaxies neither detected nor identified as
cluster members. Green cells denote detected members that were
also visually identified. Orange cells mark galaxies that the detector
did not assign to the cluster despite our classifying them as members
from the imaging. Finally, red cells are detected members not visually
identified as members. We stress the latter group in no way indicates the
purity of the cluster, as we are both incomplete and subjective in
our identification of genuine cluster members. However, this exercise
does provide a useful indication of detector performance when
compared to our visual impression of cluster membership.

The detection grids show re-detection is broadly insensitive to
the range of parameters explored. At higher probability thresholds
(increasing row number) the cluster expands to form a more extended
structure. This growth is moderated by the introduction of
a minimum cell density. We exclude Σcrit = 20Σ̄ as it removes a
significant fraction of visually identified members on the periphery
of the cluster. The middle ground between detecting a more
compact system (Pthresh = 0.005) and potentially increasing the
interloper fraction (Pthresh = 0.015) suggests the balance of detection
completeness and cluster purity lies with Pthresh = 0.01. We
note from Figure 3 there are at minimum twice as many clustered
cells as unclustered at P(a) ≤ 0.01. Although (0.01, 0Σ̄)
and (0.01, 10Σ̄) appear identical in their recovery of the cluster,
we require a non-zero density constraint to filter out spurious low



Figure 7. Effect of detection parameters on Abell 2631 (left, box scale 13.6' × 13.6') and a compact group (right, box scale 3.5' × 3.5'). Colour key: Grey are
cells with field galaxies, green are galaxies identified by the algorithm that were also visually identified as members. Red cells are members assigned to the
cluster by the detector but not visually identified as cluster members. Orange cells are galaxies that failed to be correctly identified by the algorithm as cluster
galaxies, but were defined as such visually. The circle around Abell 2631 corresponds to a 1 h^-1 Mpc radius at the cluster redshift.



amplitude systems and prevent large clusters from percolating into
giant connected structures. We consequently adopt the parameter
combination (Pthresh, Σcrit) = (0.01, 10Σ̄). To ensure these parameters
are not biased to the detection of high mass systems, we use
11 members of a visually identified compact group to perform a
re-detection in the same parameter ranges. The right-hand sequence
in Figure 7, with boxes of scale 3.5', shows the recovery of this
group, and indicates group scale detection is robust to the range of
parameters explored. The trade-off between completeness and purity
is similarly evident here, with (0.01, 10Σ̄) remaining a good
compromise between the two.



In both cases (and more generally) there is a tendency to un- 
derestimate the total number of cluster members. This arises from 
an inevitable feature of Voronoi Diagrams implying the algorithm 
is unlikely to recover all cluster members. The suppression of the 
field galaxy population with photometric filters causes an abrupt 
drop in galaxy surface density at the cluster boundary. Because the 
Voronoi cells of peripheral members have a limited number of field 
galaxies to constrain their boundaries they adopt larger areas. Such 
cells may then be rejected as members because their areas are in- 
consistent with that population. Nevertheless, tests with mock cata- 
logues allow us to quantify the impact this effect has on the cluster 
purity, as discussed later.



4 SDSS EQUATORIAL STRIPE 82 CLUSTER 
CATALOGUE 

4.1 The catalogue 

We applied the detector to a 7 square degree sample of Stripe 82, 
using the limits described in §2 and parameters described in §3.7.
Here we describe the general characteristics of this catalogue, per- 
form a series of tests on the data and briefly compare our detections 
to existing optical and X-ray-detected clusters. 

After applying the magnitude limits described in §3.7.1, a
source catalogue of 69,797 galaxies is analysed by the algorithm.
We find a total of 97 clusters, identifying a total of 1293 cluster
galaxies (0.5% of the original galaxy sample) and 813 associate
cluster members (candidate cluster members that were not selected).
Of these clusters, 34% were detected in {g-r, r-i} and 66%
in the {r-i, i-z} combinations.

Although we define a blue limit for the g-r colour-magnitude
relation (cm20 > 0.47), equivalent limits were not applied to the
r-i and i-z colours. We can however place upper bounds on the blue
limit in these colours by noting no clusters were detected below
r-i = 0.24 and i-z = 0.18. Such limits serve to reduce the search time for
future survey scans.

Table 3 shows an extract of the cluster catalogue. This 7 square
degree sample of 97 Stripe 82 clusters is available online. Each
cluster is named according to the IAU convention, in the form MGB
JHHMMSS+DDMM.m. We detail below the main features of both
catalogues.



Finally, we set the minimum membership of a cluster, Nmin,
to five galaxies.

http://orca.dur.ac.uk/

Table 3. A sample of the ORCA cluster catalogue generated in this study. Full details of the columns can be found in §4.1 - §4.5. The first column contains
the cluster name based on the IAU convention. Columns 2 and 3 note the J2000 estimated cluster positions in degrees. Columns 4 and 5 describe the cluster
redshift and source data used to calculate the redshift. Column 6 notes how many members were found in the cluster, and we provide estimates for the cluster
Bgc richness and sequence scatter in Columns 7 and 8. The final two columns indicate the radius (in degrees) enclosing 80% of the cluster members and the
ratio of this value to the 20% radius, a measure of cluster concentration.



Name                  RA         DEC       cluster_z  cz_type              Ngal  b_gc    scatter  θ80     c
MGB J234017-00030.9   355.06912  -0.06455  0.245      C0s0w0q0d0b0p6h2     6     19416   0.047    0.0001  1.700
MGB J233817+00190.0   354.56897  0.33309   0.208      C0s0w0q0d0b0p8h6     8     94461   0.038    0.0003  3.667
MGB J234113-00000.4   355.30349  -0.00597  0.166      C0s0w0q0d0b0p6h2     6     182181  0.018    0.0003  1.692
MGB J234400-00300.3   355.99952  -0.50461  0.181      C0s1w0q0d0b0p5h4     6     71831   0.025    0.0001  1.750
MGB J234725+00190.7   356.85322  0.32867   0.201      C0s0w1q0d0b0p14h14   14    10967   0.037    0.0004  2.545



4.2 Cluster positions & redshifts (cluster_z, cz_type)    4.3 Cluster richness (b_gc)



The ra and dec position quoted in the catalogue is the algorithm 
estimate of the centre of each cluster, based on the average positions 
of their members. 

Although we do not use any redshift data to generate our clus- 
ter catalogue, we provide redshift estimates for each system de- 
tected by the algorithm. These redshifts are weighted towards mem- 
bers with spectroscopic data, but two sets of photometric redshift 
data (hyperz and the DR7 photometric estimate) are used to provide 
each cluster galaxy with at least one redshift estimate. From the 
catalogue of 1293 cluster galaxies, 2.6% have spectroscopic data 
(DR7 spectroscopic redshifts, WiggleZ and 2SLAQ), 93% have 
DR7 photoz and 87% have hyperz estimates. The hyperz estimates
for cluster members were generated using only S0 and E
SEDs, a Calzetti et al. (2000) reddening law and a two-stage convergence
(over and above that performed by hyperz) to the redshift
where a range identified in coarse redshift bins is re-sampled
with a smaller bin width. Comparing these estimates to available 
spectroscopic redshifts, the measured error dispersions are higher 
in hyperz than in the DR7 pipeline (0.029 vs 0.016). 

We calculate each cluster redshift by determining the weighted 
median redshift from the available member data. The weighting 
for a spectroscopic, DR7 photoz and hyperz redshift is 4, 2, 1 
respectively, the higher weighting for DR7 photoz reflecting the 
smaller error dispersion mentioned above. To gauge the accuracy 
of our redshift estimate, we note the calculated redshift of Abell 
2631 is z = 0.26, some 0.02 lower than the value determined by 
Böhringer et al. (2000). The median cluster redshift of the whole
catalogue is zmed = 0.31, and the maximum redshift is z = 0.57.
Approximately 25% of the clusters have at least one member with 
a spectroscopically measured redshift. 
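
A weighted median with the 4:2:1 weighting can be sketched as follows. The sketch treats every available estimate as a separate weighted entry and returns the lower weighted median; both choices, along with the dictionary keys and function names, are assumptions made for illustration.

```python
# Weights quoted in the text for each redshift source.
REDSHIFT_WEIGHTS = {"spec": 4, "dr7_photoz": 2, "hyperz": 1}


def weighted_median(values, weights):
    """Smallest value at which the cumulative weight reaches half the total."""
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2.0
    running = 0.0
    for value, weight in pairs:
        running += weight
        if running >= half:
            return value
    raise ValueError("no redshift estimates supplied")


def cluster_redshift(estimates):
    """estimates: iterable of (redshift, kind) pairs for the cluster members."""
    estimates = list(estimates)
    values = [z for z, _ in estimates]
    weights = [REDSHIFT_WEIGHTS[kind] for _, kind in estimates]
    return weighted_median(values, weights)
```

With this weighting a single spectroscopic member can outweigh several hyperz-only members, which matches the stated intent of favouring spectroscopic data.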

Without access to spectroscopy, accurate photometric redshifts
of red sequence cluster galaxies are good measures of cluster
redshifts. We quantify this in Figure 8 by comparing the photometric
and spectroscopic redshifts of cluster BCGs from a sample of
the full GMB11 Stripe 82 cluster catalogue with spectroscopic redshifts.
After removing a small systematic trend and 3σ outliers, the
1σ dispersion in (zs - zp)/(1 + zs) is 0.0157 (increasing to 0.0163
when ignoring the systematic error). This suggests BCG photometric
redshifts are accurate estimates of the cluster redshift.

The cz_type property is a shorthand description of the avail- 
able redshift data for each cluster, each letter defining a measure- 
ment type, followed by the number of that type. The letters denote 
data from the mo(c)k, DR7 (s)pectroscopic, (w)iggleZ, 2SLA(q), 
DR7 (p)hotometric and (h)yperz datasets, where mock is of course 
not used in this observational data. 



With access to cluster redshifts we are able to calculate the Bgc
optical cluster richness, a robust parameter known to correlate with
cluster mass. We use the Bgc measure described in Yee & López-Cruz
(1999):

B_{gc} = \frac{\rho_{bg} D(z_{cl})^{\gamma-3} A_{gc}}{I_{\gamma} \Phi(M_3, M_3+3, z_{cl})}    (3)

where ρbg is the background surface density of all source catalogue
galaxies (irrespective of their colour) inside a 0.5 h^-1 Mpc
radius with luminosities between the third brightest cluster galaxy
(M3) and three magnitudes fainter. The integrated luminosity function,
Φ(M3, M3 + 3, zcl), is measured over the same luminosity
range. We evolve the z = 0.1 Blanton et al. (2003) SDSS r-band
luminosity function (φ* = 1.49 × 10^-2, M* = -20.44, α = -1.05)
using the prescription described in Lin et al. (1999) that adds
redshift-dependent terms to φ* and M* with parameters P = -1.06
and Q = 1.82. D, the angular diameter distance, is derived from the
cluster redshift zcl. γ and Iγ respectively define the slope of the
angular galaxy correlation function and the integration constant arising
from de-projecting the cluster. We set these to γ = 1.77 and
Iγ = 3.78. The correlation amplitude Agc is defined as:

A_{gc} = \frac{N_{net}}{N_{bg}} \frac{(3-\gamma)}{2} \theta^{\gamma-1}    (4)

where Nnet is the background-corrected count of galaxies
within the luminosity range described above, out to an angular
separation θ that corresponds to 0.5 h^-1 Mpc at the cluster redshift.
Nbg is the background galaxy count within this radius, estimated
from the mean density of galaxies across the whole field.
The full 270 deg^2 Stripe 82 catalogue provides additional definitions
of cluster richness - we refer readers to GMB11 for the details
of those measurements.
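
As a rough illustration of how the richness is assembled, the sketch below codes the two relations as we read them from Equations 3 and 4: the amplitude is (Nnet/Nbg) × (3 - γ)/2 × θ^(γ-1), and Bgc is ρbg × D^(γ-3) × Agc divided by Iγ × Φ. The integrated luminosity function is assumed to be computed elsewhere and passed in, units are left to the caller, and the function and argument names are hypothetical.

```python
def correlation_amplitude(n_net, n_bg, theta, gamma=1.77):
    """Angular correlation amplitude A_gc.

    n_net : background-corrected galaxy count within the aperture
    n_bg  : expected background count within the same aperture
    theta : angular radius of the aperture
    """
    if n_bg <= 0 or theta <= 0:
        raise ValueError("n_bg and theta must be positive")
    return (n_net / n_bg) * (3.0 - gamma) / 2.0 * theta ** (gamma - 1.0)


def richness_bgc(rho_bg, d_ang, a_gc, phi_integrated, gamma=1.77, i_gamma=3.78):
    """B_gc from the background density, distance and amplitude.

    phi_integrated is the luminosity function integrated between M3 and
    M3 + 3 at the cluster redshift (supplied by the caller).
    """
    return rho_bg * d_ang ** (gamma - 3.0) * a_gc / (i_gamma * phi_integrated)
```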



4.4 Cluster sequence scatter (scatter)

To estimate the width of a detected cluster's sequence, we first 
make a fit to the slope of the sequence and remove the tilt. Using 
cluster members between mBCG ≤ m ≤ mBCG + 3, we estimate
the sequence scatter by making a 2σ clip in the colour distribution.

The robustness of the red sequence fit is sensitive to the num- 
ber of members in the detection. Based on a bootstrap-resampling 
of the cluster sequences, we find the fitting procedure is robust in 
clusters with at least 8 members. Below this, sequence scatter esti- 
mates are dominated by fitting uncertainty. For systems of at least 
10 members, the characteristic error in the sequence scatter is 34%, 
dropping to 19% for clusters with up to 30 members and 8% for

Figure 8. Comparison of photometric redshift accuracy δz(zs) = (zs - zp)/(1 + zs) for the cluster BCGs with spectroscopic redshifts. After outlier rejection
(clipping galaxies with |δz| > 3σδz, or 0.4% of the total sample) and removing the slight systematic photoz error, we find a 1σ scatter σδz = 0.0157 (denoted
by the dotted blue lines). This highlights the excellent redshift recovery using ugriz photometry alone. For a given cluster we combine both the photometric
and (where available) spectroscopic redshifts of cluster members to derive a robust redshift estimate for the system as a whole.



those with at least 50 members. Future catalogues will provide im- 
proved estimates of the sequence-fitting error. 

4.5 Projected scale (θ80) & concentration (C)

For each cluster, a projected scale size θ80 is provided. This is calculated
as the angular radius (in degrees) enclosing 80% of cluster
members from the centre.

A measure of the projected concentration (C) is determined by
comparing the radius enclosing 80% of the cluster members to the
radius enclosing 20%. High values of θ80/θ20 indicate a centrally
concentrated cluster.
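
Both quantities can be computed directly from member positions. The sketch below assumes the cluster centre is the mean member position (as stated in §4.2), uses a flat-sky approximation with a cos(dec) correction, and ignores RA wrap-around at 0°/360°; the function names are illustrative.

```python
import math


def enclosing_radius(radii, fraction):
    """Smallest radius enclosing at least `fraction` of the members."""
    ordered = sorted(radii)
    k = max(1, math.ceil(fraction * len(ordered)))
    return ordered[k - 1]


def scale_and_concentration(ra, dec):
    """Return (theta80, C) in degrees for member positions in degrees."""
    ra0 = sum(ra) / len(ra)
    dec0 = sum(dec) / len(dec)
    cos_dec = math.cos(math.radians(dec0))
    radii = [
        math.hypot((a - ra0) * cos_dec, d - dec0) for a, d in zip(ra, dec)
    ]
    theta80 = enclosing_radius(radii, 0.8)
    theta20 = enclosing_radius(radii, 0.2)
    # Guard against a member sitting exactly on the centre.
    concentration = theta80 / theta20 if theta20 > 0 else math.inf
    return theta80, concentration
```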

4.6 Testing the algorithm 

4.6.1 Cluster re-detection robustness 

To determine how robust the detector is to catalogue incomplete- 
ness, we attempt re-detections of the Abell 2631 cluster after re- 
moving a random selection of members from the source data. Our 
sole constraint is that the cluster BCG remains in the source data. In 
the following analysis, we only consider the detected cluster clos- 
est to the original Abell 2631 position. Robustness is defined as 
the fraction of members detected in the new cluster from those remaining
in the input catalogue. We use a test g-r photometric filter
that adopts a βg-r, cm20 and σg-r best suited to the recovery
of A2631, selecting approximately 85% (108) of the visually selected
members. We experiment with removal fractions up to
95%, corresponding to the largest fraction still retaining Nmin = 5
original cluster members in the sample.

Fifty random realisations of a depleted input catalogue are 
generated for each removal fraction, yielding a median recovery 
rate based on members that could have been added to the cluster.
The solid blue line in Figure 9 shows how increasing the removal
fraction affects the fraction of cluster members recovered;
error bars on this line represent 1σ uncertainties from the 50
re-detections in each bin. The recovery fraction when no galaxies have
been ejected is ~ 93% of the 108 A2631 members located inside
the photometric filter. The other 7% were rejected by the algorithm
because either their Voronoi cells have insufficient densities to join
the overdense collection of cells (Pthresh, see §3.7.2), or their inclusion
causes the percolating cluster to drop below the critical density
(Σcrit).
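
One way to build the depleted catalogues and score a re-detection is sketched below. It is illustrative only: members are represented by identifiers, the number removed is the removal fraction of the full member list rounded to the nearest integer, and the function names are hypothetical.

```python
import random


def deplete_members(member_ids, bcg_id, removal_fraction, rng):
    """Randomly remove a fraction of members, always keeping the BCG."""
    others = [m for m in member_ids if m != bcg_id]
    n_remove = min(round(removal_fraction * len(member_ids)), len(others))
    removed = set(rng.sample(others, n_remove))
    return [m for m in member_ids if m not in removed]


def recovery_fraction(remaining, detected):
    """Fraction of the remaining members found in the re-detected cluster."""
    return len(set(remaining) & set(detected)) / len(remaining)


# Fifty realisations at an 80% removal fraction for a 108-member cluster.
rng = random.Random(1)
realisations = [
    deplete_members(list(range(108)), 0, 0.8, rng) for _ in range(50)
]
```

For 108 members this leaves 22 galaxies at an 80% removal fraction and 5 at 95%, consistent with the numbers quoted in this section.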

We take into account this intrinsic detection inefficiency, quot- 
ing yields from the cluster re-detection relative to the ~ 93% of 
members recovered where no additional galaxies are removed. Un- 
surprisingly, the fraction of detected members located in the clus-

Figure 9. The recovery fraction (solid line) and recovery accuracy (dotted
line). Some Abell 2631 cluster galaxies are randomly removed from the
source catalogue, and the fraction subsequently identified in a re-detection
of the cluster is the recovery fraction, with error bars of 1σ uncertainty
calculated from 50 re-detections. The fraction of visually identified Abell 2631
galaxies making up the re-detected cluster defines the recovery accuracy.
The fraction required to produce an Nmin = 5 member system is denoted by
the black dashed line.



ter drops as more members are excised. However, over 75% of re- 
maining members are re-detected even after half of the cluster is 
removed. Approaching larger removal fractions, the fragmentation 
of cluster members into spatially distinct groups hinders recovery 
of the complete set. The black dashed line in this plot corresponds 
to the minimum recovery fraction required to identify Nmin=5 orig- 
inal members from the input data. The algorithm can robustly iden- 
tify the original cluster down to an 80% removal fraction, corre- 
sponding to 22 of the original 108 galaxies. Below this limit, an 
insufficient number of cluster members are recovered by the detec- 
tor to identify a cluster associated with the halo. 

For each ejection fraction we also calculate the recovery accuracy:
the fraction of visually identified A2631 galaxies making
up the re-detected cluster. The dotted blue line in Figure 9 shows
this parameter. The initial accuracy (no members are removed) is 
approximately 60%, providing some estimate of our level of incom- 
pleteness when visually identifying cluster membership. As more 
members are removed, there is a gradual reduction in accuracy, im- 
plying replacement of these members with other galaxies becomes

Figure 10. The algorithm's re-detection capability when a cluster has been
moved to a random position. The recovery efficiency (solid blue line) is the
fraction of original cluster galaxies found in the displaced cluster. The edge-effect
recovery efficiency (red line) shows a similar test, instead moving the
cluster to a random position near the survey boundary. Uncertainties in both
lines are 1σ errors from 50 re-detections. The recovery accuracy (dotted
blue line) is the ratio of input cluster members to the member count of the
re-detected cluster. The black dashed line indicates the Nmin = 5 threshold
required to secure a robust detection of the cluster's halo.



more commonplace. At large (> 70%) removal fractions, fragmen- 
tation acts to reduce the connectivity of cluster members, increas- 
ing the number of contaminant galaxies that share the photometric 
filter. 



4.6.2 Cluster displacement and edge effects

A cluster detector should identify systems irrespective of the pro- 
jected environment they are located in. Ideally then, recovery of 
identified members is achieved even if the system is moved to an- 
other position. 

To determine the sensitivity of cluster identification to lo- 
calised background fluctuations, we shift source data positions of 
known cluster members to a random location, keeping their spa- 
tial distribution intact. A buffer is created around the survey edge 
to ensure no cluster members are displaced outside the boundaries, 
then a re-detection of the cluster is attempted. The re-detection per- 
formance is quantified by the recovery efficiency - the fraction of 
original members in the new cluster, and the recovery accuracy re- 
mains as defined in the previous test. 
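
The displacement step can be sketched as a rigid shift of the member positions. The example assumes a rectangular survey footprint and a flat-sky shift in RA and Dec, which is a reasonable approximation only close to the equator; the bounds tuple and function name are hypothetical.

```python
import random


def displace_cluster(members, bounds, rng):
    """Shift a cluster rigidly to a random position inside the survey.

    members : list of (ra, dec) positions in degrees
    bounds  : (ra_min, ra_max, dec_min, dec_max) of the survey area
    The buffer is the half-extent of the cluster, so no member can be
    moved outside the boundaries.
    """
    ras = [ra for ra, _ in members]
    decs = [dec for _, dec in members]
    half_w = (max(ras) - min(ras)) / 2.0
    half_h = (max(decs) - min(decs)) / 2.0
    ra_min, ra_max, dec_min, dec_max = bounds
    if 2 * half_w > ra_max - ra_min or 2 * half_h > dec_max - dec_min:
        raise ValueError("cluster does not fit inside the survey bounds")
    centre_ra = (max(ras) + min(ras)) / 2.0
    centre_dec = (max(decs) + min(decs)) / 2.0
    new_ra = rng.uniform(ra_min + half_w, ra_max - half_w)
    new_dec = rng.uniform(dec_min + half_h, dec_max - half_h)
    d_ra, d_dec = new_ra - centre_ra, new_dec - centre_dec
    return [(ra + d_ra, dec + d_dec) for ra, dec in members]
```

Because every member receives the same offset, the relative configuration of the cluster is preserved and only its background changes.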

Figure [To] shows the recovery efficiency {solid blue) and re- 
covery accuracy {dotted blue) for clusters spanning more than an 
order of magnitude in membership (Nmin=5 to 174 galaxies). If 
there was a choice of cluster for a membership bin, we used the 
system with the smallest sequence scatter to determine the impact 
of displacement on the best candidate in that membership group. 
Each cluster was re-detected in the pair of selection filters it was 
originally identified in, meaning a re-detection with no displace- 
ment would yield a perfect recovery efficiency and recovery accu- 
racy (both equal to unity). We perform 50 random displacements 
for each of the selected clusters, using their scatter to derive 1σ 
uncertainties from the mean. The black dashed line in Figure 10 
corresponds to the recovery fraction required to detect Nmin=5 galaxies 
of the original system from each displaced cluster. 
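
The two recovery statistics and the dashed threshold can be written compactly. The sketch below reflects our reading of the definitions (efficiency relative to the original membership, accuracy relative to the re-detected membership); the function names are illustrative and galaxies are assumed to carry unique identifiers.

```python
from statistics import mean, stdev

N_MIN = 5  # minimum membership for a detection, as quoted in the text

def recovery_efficiency(original, redetected):
    """Fraction of the original members present in the re-detected cluster."""
    original, redetected = set(original), set(redetected)
    return len(original & redetected) / len(original)

def recovery_accuracy(original, redetected):
    """Fraction of the re-detected cluster made up of original members."""
    original, redetected = set(original), set(redetected)
    if not redetected:
        return 0.0
    return len(original & redetected) / len(redetected)

def detection_threshold(n_original, n_min=N_MIN):
    """Recovery fraction needed to retain n_min of the original members."""
    return n_min / n_original

def summarise(values):
    """Mean and 1-sigma sample scatter over repeated displacements."""
    return mean(values), stdev(values)
```

A 10-member cluster re-detected with 8 original members and 1 interloper has an efficiency of 0.8, an accuracy of 8/9, and needs a recovery fraction of 0.5 to remain a detection.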

For the majority of cluster sizes, recovery accuracies are ap- 
proximately constant at ~ 90%, meaning 10% of the cluster mem- 
bers are background galaxies selected in the same photometric se- 
lection. Recovery efficiency data suggest the detector makes sig- 
nificant cluster re-detections for systems down to 10 members, 
but smaller groups are susceptible to higher levels of contamination 
and fragmentation. Our example case of Abell 2631 (at 
log10 Ngal ~ 2.1) has a recovery efficiency of 80%, approximately 
13% lower than the recovery fraction from the robustness test 
calculated above. A recovery accuracy of ~86% is consistent with 
the detector swapping 13% of original members with background 
galaxies when the cluster is moved. 

We next establish how survey edges bias the detection of sys- 
tems at the boundaries. Using the same set of clusters, we repeat the 
above experiment, specifically placing systems close to the survey 
edges to quantify the impact of edge effects on group and cluster re- 
covery. When moving each cluster, we ensure no members are out- 
side of the survey boundary. The average separation between sur- 
vey edge and the member furthest from the cluster centre is around 
23 arcseconds. 

Galaxy cells at the boundary of a Voronoi Diagram are un- 
bounded, often resulting in very large cell areas. This may ham- 
per the identification of low-membership clusters, where a member 
with cell area exceeding the probability threshold may preclude the 
cluster from detection. Random positions are selected along any 
one of the four sides of the survey (allowing clusters to reside in 
a corner). In our source catalogue, the declination boundaries (at 
δ = ±1.25°) are set by the geometry of the stripe, whilst the RA 
boundaries are artificially defined. Distances between the cluster 
centroid and survey edge are large enough to include all members 
within the survey. The red line in Figure 10 shows the recovery 
efficiency based again on 50 randomised displacements. This dis- 
tribution is very similar to that of the displacement test above, sug- 
gesting edge effects do not hinder the recovery of clusters any more 
than the displacement of the members themselves. This is particu- 
larly significant at group scales, where the exclusion of one or two 
members could prevent the detection of the system. 



4.6.3 False positive detection rate 

We set the detector the task of attempting to detect spatially clus- 
tered systems with randomised colours. This establishes the impor- 
tance of red sequences to cluster detection with this algorithm and 
provides an estimate of the false detection rate. We run the detec- 
tor on the source catalogue in the same manner as before, having 
first shuffled the colours so while cluster members still reside in 
high surface density regions, they no longer have red-sequences. 
We identified two "clusters" (with 5 and 6 members) in the 7 
square degree survey, both located at the positions of original high- 
membership ORCA clusters. To ensure this calculation is uninflu- 
enced by the size of the survey, we repeat this process on the full 
Stripe 82 dataset (-50° < α < 59°) covering 270 square degrees. 
The algorithm detects 15 "clusters" from these data, each consist- 
ing of five or six-member groups. From this we infer the number of 
spurious systems detected per 7 square degrees is 0.39. 
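
A colour shuffle of this kind can be sketched as below. The catalogue layout (a list of dictionaries with `ra`, `dec` and `colours` keys) is hypothetical, and permuting each galaxy's full set of colours as one unit is an assumption; the text does not specify either.

```python
import random

def shuffle_colours(galaxies, rng):
    """Return a copy of the catalogue with colours permuted between galaxies.

    galaxies: list of dicts with 'ra', 'dec' and 'colours' keys (assumed layout).
    Positions are untouched, so the projected surface density is unchanged,
    but any colour-magnitude sequence among neighbours is destroyed.
    """
    colours = [g["colours"] for g in galaxies]
    rng.shuffle(colours)
    return [dict(g, colours=c) for g, c in zip(galaxies, colours)]
```

The detector is then run on the shuffled catalogue exactly as on the original.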

In a similar fashion we next randomise galaxy positions while 
keeping the colours the same. This means cluster red-sequences re- 
main intact as the algorithm scans through colour-magnitude space, 
but points clustered in colour are no longer clustered in the sky. The 
algorithm detected four "clusters" over the full 270 square degree 
Stripe 82 dataset, implying a ~0.1% spurious cluster detection rate. 

Both exercises suggest the detector cannot identify clusters 
without correlations in both colour and spatial position. Moreover 
the probability of detecting systems based on random distributions 
of both colour and position is below 1%. 
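
The quoted rates follow from scaling the false detections to the 7 square degree sample. In the sketch below, expressing the result as a percentage of the 97 clusters in the catalogue is our assumption about how the quoted rates were normalised.

```python
SAMPLE_AREA = 7.0    # square degrees in the sample studied here
FULL_AREA = 270.0    # square degrees in the full Stripe 82 dataset
N_CATALOGUE = 97     # clusters detected in the 7 square degree sample

def spurious_per_sample(n_false, area=FULL_AREA, sample=SAMPLE_AREA):
    """Expected number of false detections in the sample area."""
    return n_false * sample / area

colour_test = spurious_per_sample(15)    # shuffled colours
position_test = spurious_per_sample(4)   # randomised positions

print(f"shuffled colours:     {colour_test:.2f} per 7 sq. deg.")
print(f"randomised positions: {position_test:.2f} per 7 sq. deg.")
print(f"as a fraction of the catalogue: "
      f"{100 * colour_test / N_CATALOGUE:.1f}% and "
      f"{100 * position_test / N_CATALOGUE:.1f}%")
```

Under this normalisation both tests give rates below 1%.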



4.6.4 Projected cluster-pair resolution 

The ideal algorithm can identify two clusters with the same an- 
gular position on the sky, but at different radial distances. Using 
the c_m20-z relation demonstrated in Figure 2, one can in principle 
isolate superimposed systems by identifying them in different 
filters. Within a detection filter f(C_a) of width σ_f, two spatially 
coincident systems will be merged even if their sequences do not 
directly overlap. We overcome this limitation by splitting sequences 
in the following colour (C_b) with the application of joint filters 
(§3.3). The resolving power of the algorithm in projection is therefore 
limited by the merging of separate clusters that are mistaken 
as multiple detections in §3.6. 

We test this effect with the same clusters used in §4.6.2 by 
implanting a 7-member test cluster at the same spatial position and 
colour normalisation c_m20. We increase the test cluster C_b colour 
normalisation by δc_m20 and run the matching algorithm. This is 
repeated until the detector classifies the reddened test cluster as an 
independent system. The resolving capability of the algorithm can 
be parametrised as χ = Δc_m20/σ: the minimum sequence colour 
separation between the two detected systems relative to the width of 
the filters they were identified in. Small values indicate a good 
resolution, and in all clusters tested against, we found χ < 0.5. 
Moreover, for all but two membership bins (Ngal=14,18) the test cluster 
was resolved within χ < 0.25. Whilst in our real astronomical 
data we observe some cluster pairs overlapping in projected space, 
these examples exhibit large separations in both colour space and 
redshift. For example the two clusters MGB J234729-00080.4 and 
MGB J234733-00100.0 have redshifts of z = 0.23 and z = 0.53 
and χ_{r-i} = 7.8. Although our analysis here could benefit from a 
larger sample size, ORCA can distinguish between two separate sys- 
tems even if their sequences lie in the same filter, subject to their 
colour separation being at least 1/4 the filter width. Below this level, 
their similarity in colour likely justifies classifying these systems as 
the same structure. 
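
The iterative reddening test can be sketched as a simple loop. Here `is_resolved` is a hypothetical stand-in for a full detector run on the catalogue with the implanted cluster, and the step size `delta` is an illustrative value, not one taken from the text.

```python
def resolving_chi(is_resolved, sigma, delta=0.01, max_steps=1000):
    """Step the test cluster redwards until the detector separates it.

    is_resolved: callable taking the accumulated colour offset and returning
                 True once the detector reports two independent systems.
    sigma:       width of the detection filter.
    delta:       colour increment applied per step (assumed value).
    Returns chi = offset / sigma, or None if the pair is never resolved.
    """
    for step in range(1, max_steps + 1):
        offset = step * delta
        if is_resolved(offset):
            return offset / sigma
    return None
```

With a filter width of 0.13 and a detector that first separates the pair at an offset of 0.03, the function returns χ ≈ 0.23.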



5 COMPARISON TO EXISTING CLUSTER DATA 

The positions of detected clusters can be seen in Figure 11, with the 
location of maxBCG clusters (Koester et al. 2007a) marked with red 
circles, and the positions of known X-ray clusters marked with blue 
squares. Clusters detected in the {g-r, r-i} combination are shown 
as blue filled cells, those detected in {r-i, i-z} are red filled cells. In 
each case the cluster BCG cells are yellow. 



5.1 The maxBCG catalogue 

The Koester et al. (2007a) maxBCG catalogue of 13,823 optically 
selected SDSS clusters uses the detection algorithm described in 
Koester et al. (2007b). This catalogue makes use of data from an 
earlier release of SDSS, so was unable to take advantage of the 
added depth Stripe 82 offered this study. Because direct comparison 
of the two cluster selection functions is both non-trivial and unfair, 
we do not attempt a full analysis in this study. However, in the spirit 
of matching detections made here to those of the shallower data in 
the Koester et al. (2007a) catalogue, we include the positions of 
maxBCG clusters in Figure 11 as a set of red circles. The centre of 
these circles is the location of the assigned Brightest Cluster Galaxy 
(BCG), whilst the radius corresponds to 1 h^-1 Mpc calculated from 
the published photometric redshift estimate of the cluster. We stress, 
however, that this does not necessarily correspond to the physical 
size of the cluster. 
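
For orientation, the angular size of such a circle can be estimated as below. This sketch assumes a flat ΛCDM cosmology with an illustrative Ω_m = 0.3 and treats 1 h^-1 Mpc as a proper length; neither choice is stated in this section, so the numbers are indicative only.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s

def angular_radius_arcmin(z, radius_mpc_h=1.0, omega_m=0.3, steps=1000):
    """Angle subtended by a proper radius (in h^-1 Mpc) at redshift z > 0.

    Assumes flat LambdaCDM; omega_m is an illustrative value. Working in
    h^-1 Mpc corresponds to H0 = 100 h km/s/Mpc, so h drops out.
    """
    omega_l = 1.0 - omega_m

    def inv_e(zp):
        return 1.0 / math.sqrt(omega_m * (1.0 + zp) ** 3 + omega_l)

    # Comoving distance from the trapezium rule, in h^-1 Mpc.
    dz = z / steps
    integral = 0.5 * (inv_e(0.0) + inv_e(z))
    integral += sum(inv_e(i * dz) for i in range(1, steps))
    d_comoving = (C_KM_S / 100.0) * integral * dz
    d_angular = d_comoving / (1.0 + z)
    return math.degrees(radius_mpc_h / d_angular) * 60.0
```

Under these assumptions a cluster at z = 0.27 has a circle radius of a little under 6 arcminutes.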

The survey area contains 22 maxBCG clusters. For ease of reference, 
salient details from that catalogue are reproduced in Table 4, 
along with a name of the form BCG JHHMMSS+DDMM.m. 
We attempt a simple match to the ORCA catalogue by looking for 
either common BCGs (and more generally a match to ORCA cluster 
members where BCGs are assigned differently) or statistically 
significant separations between ORCA centroids and maxBCG positions. 
We find a match to 18 of the 22 clusters; the four maxBCG 
clusters that do not have ORCA analogues are noted in Figure 11 
with dashed circles and are apparent as two pairs with small angular 
separation. 
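
The angular separations of the two unmatched pairs can be checked from the maxBCG positions listed in Table 4. The sketch uses a small-angle, flat-sky formula, which is an approximation we adopt here because the field lies within about a degree of the celestial equator.

```python
import math

def separation_arcmin(ra1, dec1, ra2, dec2):
    """Small-angle separation between two positions given in degrees."""
    mean_dec = math.radians(0.5 * (dec1 + dec2))
    d_ra = (ra1 - ra2) * math.cos(mean_dec)
    d_dec = dec1 - dec2
    return 60.0 * math.hypot(d_ra, d_dec)

# Western pair: BCG J234322+00190.6 and BCG J234403+00130.6.
west = separation_arcmin(355.84039, 0.32587, 356.01273, 0.22646)
# Eastern pair: BCG J234106+00120.4 and BCG J234122+00190.0.
east = separation_arcmin(355.27640, 0.20707, 355.34253, 0.33330)
print(f"west pair: {west:.1f} arcmin, east pair: {east:.1f} arcmin")
```

Both separations are of order 10 arcminutes.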

We note the ORCA cluster (MGB J234341+00180.3) is situated 
between the western pair (BCG J234322+00190.6 and BCG 
J234403+00130.6). Optical-band imaging (Figure 16 in the Appendix) 
shows evidence of early type galaxies distributed in a filamentary 
chain, approximate comoving length 2 h^-1 Mpc, sampled by ORCA 
between the maxBCG detections. 

The other pair (BCG J234106+00120.4 and BCG 
J234122+00190.0) may be part of an elongated structure 
sampled both by the four maxBCG entries in that area and 
by the ORCA detector. Figure 11 shows the ORCA cluster MGB 
J234105+00180.3. This cluster centre, situated between the two 
maxBCG clusters, matches the centroid of a RASS cluster to 
within 0.4', with an uncertainty of ~1' in the X-ray source. 

Overall, we find very good agreement with the maxBCG cat- 
alogue of clusters, detecting 81% of their entries in the survey re- 
gion, rising to 100% when taking into account how the different 
algorithms handle systems that by eye resemble filamentary struc- 
ture. 

5.2 X-ray detected clusters 

X-ray selected cluster catalogues are useful independent checks on 
the population of clusters detected by optical cluster-finders. We 
use cluster data from the ROSAT All Sky Survey-derived (RASS; 
Voges et al. 1999) NORAS (Böhringer et al. 2000) and BCS catalogues 
(for the latter, both main and extended catalogues; Ebeling 
et al. 1998, 2000), the XCS (Romer et al. 2001; Mehrtens et al. 
2011) and BLOX (Dietrich et al. 2007) from XMM-Newton, and 
ChaMP (Barkhouse et al. 2006) from Chandra. We combine these 
datasets, taking care to identify any duplicate detections, to form 
an X-ray catalogue consisting of 1463 unique clusters. From this 
catalogue there are 58 X-ray clusters within the full 270 square de- 
gree footprint covered by Stripe 82, and two of these lie within the 
7 square-degree sample studied here. In future we will provide a 
comparison of these X-ray data to an optical cluster catalogue cov- 
ering a larger area. 

Blue squares in Figure 11 show the position of the two clusters 
in the region we study here. The westernmost X-ray cluster, RXC 
J2337.6+0016 (also detected in the flux-limited Brightest Cluster 
Sample, Ebeling et al. 1998), is the X-ray counterpart to ACO 2631 
(Abell, Corwin & Olowin 1989) and has a redshift of 0.2780 (Crawford 
et al. 1995). The X-ray position coincides with the ORCA 
detection of this system (MGB J233740+00160.2; z=0.2571) at 
a separation (Δθ, Δz) of (0.1', 0.021). The easternmost X-ray 
cluster (RXC J2341.1+0018) with a redshift of z=0.2766 (Katgert 
et al. 1998, misidentified as ACO 2644) was originally optically 
identified in Goto et al. (2002) and Lopes et al. (2004), and 
is in close proximity to MGB J234105+00180.3 (z=0.2588), with 
(Δθ, Δz)=(0.4', 0.018). This latter match also appears to straddle 
two maxBCG clusters in the same region as the potentially elongated 
structure discussed in §5.1. 

Table 4. An extract from the Koester et al. (2007a) catalogue noting the 22 maxBCG clusters within the limits of this SDSS sample field. The cluster name 
follows the IAU JHHMMSS+DDMM.m format. The RA and DEC are J2000, and measured in degrees. z_photo and z_spec are the estimated photometric and 
spectroscopic redshifts of the clusters. Ngal is the number of members in the cluster, and Ngal^R200 is the scaled richness. 

Cluster name           RA          DEC        z_photo   z_spec   Ngal   Ngal^R200 
BCG J233740+00160.3    …           0.27138    …         0.277    59     88 
BCG J234624+00440.0    …           0.74943    …         0.275    25     26 
BCG J233746-00420.2    …          -0.70310    …         0.287    20     17 
BCG J234100+00040.9    …           0.08161    0.194     0.185    23     23 
BCG J233955-00250.0    …          -0.43282    …         0.277    17     15 
BCG J234548-01070.7    356.45068  -1.12775    0.273     -        18     18 
BCG J234604-00100.0    356.51477  -0.18283    0.254     -        22     22 
BCG J234322+00190.6    355.84039   0.32587    0.257     0.267    38     60 
BCG J234146+01070.5    355.44077   1.12444    0.246     0.251    15     11 
BCG J233919-00150.6    354.82941  -0.25941    0.284     -        14     11 
BCG J234024-00050.6    355.10205  -0.09300    0.281     -        17     13 
BCG J234720+00290.7    356.83487   0.49456    0.286     0.275    12     10 
BCG J233900+00420.0    354.75143   0.71610    0.219     0.183    14     11 
BCG J234122+00190.0    355.34253   0.33330    0.284     0.278    22     22 
BCG J233911-01130.3    354.79459  -1.22236    0.292     -        14     10 
BCG J234626+00430.7    356.60690   0.72794    0.251     -        25     29 
BCG J234403+00130.6    356.01273   0.22646    0.262     -        16     11 
BCG J234233-00170.3    355.63776  -0.28873    0.275     -        16     14 
BCG J233755+00130.5    354.47760   0.22478    0.262     0.278    37     61 
BCG J233825-00090.2    354.60291  -0.15397    0.270     -        14     11 
BCG J234737-00370.9    356.90375  -0.63221    0.262     -        14     11 
BCG J234106+00120.4    355.27640   0.20707    0.262     -        15     10 



6 PS1 MOCK CLUSTER CATALOGUE 

6.1 Simulations 

In this section, we describe the application of ORCA to a mock PS1 
lightcone. Theoretical simulations allow one the luxury of comparing 
clusters detected by the algorithm (ORCA clusters) to the 
galaxy membership of dark matter haloes (hereafter ΛCDM clusters). 
Simulated galaxies are allocated to dark matter haloes using 
the Bower et al. (2006) semi-analytic model. This approach makes 
the assumption that a satellite galaxy is stripped of hot gas immediately 
following accretion onto a large halo. Star formation is halted after 
the cold gas reservoir is depleted, and the galaxy joins the red 
sequence. Coupled with AGN feedback, this prescription reproduces 
the observed bimodality in galaxy colours. However, a known flaw, 
the rate of gas depletion, results in redder than observed satellite 
galaxies. Recent treatments of ram-pressure stripping (e.g., McCarthy 
et al. 2008) hope to improve understanding of the transition 
to early-type galaxies with improved semi-analytic models (Font 
et al. 2008; Benson & Bower 2010). 

Although mock surveys are inaccurate realisations of the universe 
(see Hilbert & White 2010 for an example in a cluster detection 
context), they can nevertheless serve as self-consistent tests of 
the detector. We emphasise, however, there is little merit in comparing 
mock cluster detections with those in survey data until models 
can reproduce the observed group and cluster galaxy population 
with more fidelity. 

To compare ORCA detections to the model, we construct 
ΛCDM clusters with the aid of halo memberships and full 3D 
galaxy data. In each ΛCDM cluster, we calculate the approximate 
centre from cluster member positions. Outlier galaxies are identified 
by rejecting 3σ deviations from a bootstrap-estimated median 
galaxy-centroid distance. Following outlier ejection, we find 
the resultant cluster sizes agree well with the virial radii of the host 
haloes. We set a minimum cluster mass limit by selecting ΛCDM 
clusters residing in haloes with Mh ≥ 10^[…] h^-1 M⊙. 
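
One possible reading of the outlier rejection is sketched below. The text does not specify how σ is measured, so taking it as the scatter of the galaxy-centroid distances, using the mean position as the centre, and rejecting only galaxies that lie too far out are all assumptions of this illustration.

```python
import math
import random
from statistics import median, pstdev

def reject_outliers(members, n_boot=500, n_sigma=3.0, seed=0):
    """Drop members lying far from the cluster centroid.

    members: list of (x, y, z) positions; the centre is their mean position.
    The typical galaxy-centroid distance is a bootstrap estimate of the median.
    Sigma is taken as the scatter of the distances (an assumption).
    """
    rng = random.Random(seed)
    n = len(members)
    centre = tuple(sum(m[i] for m in members) / n for i in range(3))
    dist = [math.dist(m, centre) for m in members]
    # Bootstrap: resample distances with replacement and average the medians.
    boot = [median(rng.choices(dist, k=n)) for _ in range(n_boot)]
    typical = sum(boot) / n_boot
    sigma = pstdev(dist)
    return [m for m, d in zip(members, dist) if d - typical <= n_sigma * sigma]
```

A single galaxy placed far from an otherwise compact group is removed, while the compact members are retained.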



6.2 Mock reference cluster 

We select a "reference cluster" from a set of ΛCDM-based detections 
generated from a preliminary scan of the simulation. The chosen 
cluster allows us to set the slope and width of the photometric 
filters in our search through the mock data. Candidate training clusters 
were identified from a redshift range bracketing Abell 2631 
(z = 0.278), with similar memberships and a clear sequence in all 
colours. We selected the richest of these candidates, featuring 130 
members and a redshift of z = 0.3. By applying the same fitting 
techniques as those described in §3.7.1, we set the filter parameters 
listed in Table 5 and apply the same colour ranges as those used 
on the SDSS. The fitted gradients are steeper in g-r and r-i than 
those used for the SDSS, and the filter widths are smaller. These 
values were nevertheless consistent with the other candidate reference 
clusters identified in the mock. As before, we use the most 
conservative width (g-r, 0.13) for filters in each colour. 

Figure 11. Clusters detected in the Stripe 82 field. The coloured cells represent clusters detected in different colour pairs. Blue cells correspond to clusters 
detected in {g-r, r-i} filter pairs, red clusters detected in {r-i, i-z} filter pairs. Yellow cells indicate the BCG position of each cluster. Red circles indicate 
the position of maxBCG clusters, based on data shallower than that used in the study here. Circle radii correspond to 1 h^-1 Mpc, based on the maxBCG 
photometric redshift estimate of the cluster. Dashed red circles indicate the four maxBCG clusters discussed in §5.1 that also feature gri-colour imaging in 
Figures 16 and 17. Blue squares note the position of ROSAT All Sky Survey X-ray sources, with half-lengths corresponding to 1 h^-1 Mpc. 



Table 5. Filter parameters fitted from the mock reference cluster (by analogy 
with those derived from Abell 2631) along with colour ranges searched 
by the detector (the same as those used in the Stripe 82 data). 

Colour   Slope (β)   Width (σ)   Range           Filters 
g-r      -0.070      0.130       0.47 - 2.00     39 
r-i      -0.032      0.064       0.00 - 1.22     38 
i-z      -0.012      0.035       -0.10 - 1.10    31 


6.3 Producing ΛCDM and mock ORCA cluster catalogues 

Except for the revised parameters listed in Table 5, the detector ran 
as described in §3, and the applied magnitude limits created a source 
catalogue of 80,536 mock galaxies. Because the algorithm relies 
on the detection of colour-magnitude ridgelines, we do not want to 
include ΛCDM clusters without detectable sequences. We therefore 
construct the ΛCDM cluster list from galaxies selected in the same 
photometric filters used by the detector, meaning ΛCDM clusters 
may also be detected multiple times. We group together ΛCDM 
clusters with common halo identifiers, but as before select the 
highest reduced flux candidate as the "best" ΛCDM cluster. 



We found a total of 305 ORCA clusters with Mh ≥ 10^[…] h^-1 M⊙, 
compared with 414 ΛCDM clusters; at Mh ≥ 10^[…] h^-1 M⊙ the counts 
are more equal. 
Although the majority of clusters identified are at z ~ 0.3, the tests 
we describe in the following section will highlight how well ORCA 
performs over this entire parameter space. Figure 12 shows a simple 
comparison of the two catalogues by plotting both sets of clusters 
residing in haloes Mh ≥ 10^[…] h^-1 M⊙ out to z = 0.6 (the 
highest cluster redshift in the SDSS cluster catalogue). Grey circle 
centres denote the position, and their radii the maximum member-cluster 
centre distance of ΛCDM clusters. Blue and red cells represent 
ORCA clusters detected in {g-r, r-i} and {r-i, i-z} respectively. 

Figure 12. Clusters in haloes of mass ≥ 10^[…] h^-1 M⊙ from the mock ORCA cluster catalogue (cells) and the ΛCDM catalogue (circles). Cell colours 
correspond to clusters detected in different colour pairs. Blue cells are clusters detected in the {g-r, r-i} filter pairs, red are clusters detected in {r-i, i-z}. 
Yellow cells indicate the BCG of each cluster. Crosses denote the ΛCDM cluster centre, and circle radii indicate the angular distance between the centre and 
most distant member. 

6.4 Performance of the algorithm 

To determine how well the detector recovers and characterises the 
mock clusters, we illustrate here three simple tests to quantify the 
detection performance. 



6.4.1 Completeness 

We define completeness as the number of detected haloes as a 
function of halo mass and redshift. A halo is detected if at least 
Nmin galaxies are identified, even if they are shared between multiple 
ORCA clusters (for example, fragmenting a halo when the algorithm 
attempts to identify substructure). We compare this number 
to ΛCDM cluster counts (by definition unfragmented), with at least 
Nmin members. 
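
The detection criterion and the resulting completeness can be sketched as follows. The data layout (clusters as lists of galaxy and halo identifier pairs) and the function names are hypothetical; the logic follows the definition above, counting each galaxy once even when a halo is fragmented across several ORCA clusters.

```python
from collections import Counter

N_MIN = 5  # minimum membership, as quoted in the text

def detected_haloes(orca_clusters, n_min=N_MIN):
    """Haloes with at least n_min galaxies assigned to any ORCA cluster.

    orca_clusters: list of clusters, each a list of (galaxy_id, halo_id) pairs.
    """
    halo_of = {}
    for cluster in orca_clusters:
        for galaxy_id, halo_id in cluster:
            halo_of[galaxy_id] = halo_id  # each galaxy is counted once
    counts = Counter(halo_of.values())
    return {halo for halo, n in counts.items() if n >= n_min}

def completeness(orca_clusters, lcdm_sizes, n_min=N_MIN):
    """Fraction of eligible LCDM clusters whose halo is detected.

    lcdm_sizes: dict mapping halo_id to the LCDM cluster membership.
    Returns None when no LCDM cluster reaches n_min members.
    """
    eligible = {h for h, n in lcdm_sizes.items() if n >= n_min}
    if not eligible:
        return None
    return len(detected_haloes(orca_clusters, n_min) & eligible) / len(eligible)
```

A halo split into fragments of three and two galaxies therefore still counts as detected.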

The fraction of detected ΛCDM clusters can be seen in Figure 
13, where we produce a grid of cells with sampling intervals of 0.05 
in redshift and 0.2 in log10 halo mass. Because in some cases only a 
few detections occupy each cell, some regions will suffer from shot 
noise. We smooth the data using a 3 × 3 grid so the completeness 
for a given cell is the mean completeness over this region. Empty 
regions in Figure 13 therefore indicate where either no ΛCDM clusters 
exist or too few clusters are found to reliably calculate the 
completeness (we set a threshold of at least five clusters detected over 
the 3 × 3 grid). For 0.1 ≤ z ≤ 0.4, the detector attains at 
least 68% completeness for halo masses above 10^[…] h^-1 M⊙, and 
is over 90% complete in halo masses exceeding 10^[…] h^-1 M⊙. 
This compares favourably with the maxBCG algorithm applied to 
mock simulations, where Koester et al. (2007b) report > 90% 
completeness between 0.1 ≤ z ≤ 0.3 for Mh ≥ 10^[…] h^-1 M⊙ with 
clusters containing at least 10 members (cf. Nmin=5 in this study). 
Applying the completeness definition and the same selection criteria 
as that study, the ORCA detector is > 90% complete down to 
a halo mass of 10^[…] h^-1 M⊙. These results also compare well to 
the Voronoi Tessellation completeness of the 2TecX (van Breukelen 
& Clewley 2009) algorithm, either matching or exceeding their 
stated completeness for Mh = 10^[…] and 10^[…] h^-1 M⊙ up to our 
redshift limit. 

Figure 13. Completeness of mock ΛCDM clusters. The fraction of correctly 
detected clusters from the ORCA catalogue as a function of halo mass 
and redshift. The white regions indicate where there were no ΛCDM clusters 
in that bin. 
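
The gridding and smoothing can be illustrated with the sketch below. Two details are assumptions on our part: the smoothed value is the mean of the per-cell completeness over occupied cells, and the five-cluster threshold is applied to the number of ΛCDM clusters in the 3 × 3 window.

```python
def smooth_completeness(detected, total, min_clusters=5):
    """3 x 3 box smoothing of a completeness grid.

    detected, total: 2D lists (redshift bin by mass bin) of cluster counts.
    Returns a grid holding the mean completeness of the occupied cells in each
    3 x 3 neighbourhood, or None where the neighbourhood contains fewer than
    min_clusters clusters.
    """
    n_z, n_m = len(total), len(total[0])
    out = [[None] * n_m for _ in range(n_z)]
    for i in range(n_z):
        for j in range(n_m):
            fractions, count = [], 0
            # Visit the cell and its neighbours, clipped at the grid edges.
            for a in range(max(0, i - 1), min(n_z, i + 2)):
                for b in range(max(0, j - 1), min(n_m, j + 2)):
                    if total[a][b] > 0:
                        fractions.append(detected[a][b] / total[a][b])
                        count += total[a][b]
            if count >= min_clusters:
                out[i][j] = sum(fractions) / len(fractions)
    return out
```

Cells returned as `None` correspond to the empty regions of the completeness map.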

At higher redshifts there is a decline in completeness where 
there are only a few members brighter than the magnitude limit, 
reducing the algorithm sensitivity to distant clusters. This effect 
is more apparent among the lower mass haloes. At high redshift 
(z > 0.4) and low mass (Mh < 10^[…] h^-1 M⊙) there are 12 
ΛCDM clusters, but the detector identifies only two of these. We 
also note a local incompleteness at z ≲ 0.08. This arises from our 
choice of probability threshold (Pthresh): too few overdense cells 
are selected in filters featuring low signal-to-noise clusters. The 
photometric filters best suited to detecting local, relatively blue 
clusters have galaxy populations dominated by the blue cloud 
component of the colour-magnitude relation. Detection in 
this crowded field is further hampered by the larger scale-size of more 
local clusters such as the local (z = 0.03) seven-member group 
at the north-western boundary of the catalogue in Figure 12. Under 
these circumstances, it becomes unlikely cluster Voronoi cells 
share common vertices, restricting potential membership links between 
them. 

Figure 14. Stellar mass accuracy. The fraction of recovered stellar mass in 
mock clusters as a function of halo mass and redshift. 

We classify spurious detections in the mock cluster catalogue 
as those clusters where each member belongs to a different halo. 
Of the 305 ORCA clusters, only two fit this description, suggesting 
a spurious detection rate (0.7%) consistent with tests performed in 



6.4.2 Stellar mass accuracy 

Stellar mass accuracy is the stellar mass of an ORCA cluster relative 
to that of the ΛCDM cluster belonging to the same halo. 
Because the algorithm may split the halo galaxies into multiple 
clusters, we combine the mass of all ORCA clusters sharing the 
same halo. In ΛCDM clusters with up to ~12 members (approximately 
75% of the catalogue), over half of the total cluster stellar 
mass comes from the two most massive galaxies. The efficient 
detection of these galaxies is therefore essential in gaining accurate 
estimates of cluster stellar masses. The stellar mass accuracy for 
each ΛCDM cluster is A_* = M_*^ORCA / M_*^ΛCDM, where M_*^ORCA is the stellar 
mass of all ORCA cluster members registered to the ΛCDM cluster's 
halo. We apply the same gridding technique discussed in the 
previous section, requiring at least 5 clusters in a grid to define a 
reliable A_*. As Figure 14 shows, between 0.1 ≤ z ≤ 0.4 the 
algorithm recovers over half of the cluster stellar mass for systems 
with halo masses of at least 10^[…] h^-1 M⊙. This recovery fraction 
improves with increasing mass, reaching 90% in some cases. Both 
local and distant clusters suffer from lower stellar mass estimates. 
For the former, higher levels of halo fragmentation (one halo being 
assigned to many ORCA clusters) result in galaxies lost to nearby 
systems with densities or memberships too low to qualify as clusters. 
Those systems with redshifts z > 0.5 tend to be unfragmented 
but contain fewer members, causing an underestimation of cluster 
stellar mass. The stellar mass accuracy at the median redshift of 
the survey (z = 0.33) remains above 50% down to halo masses 
of 10^[…] h^-1 M⊙, and above 75% from masses of 10^[…] h^-1 M⊙, 
suggesting the detector performs well in estimating the true cluster 
stellar mass content. 
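
A minimal sketch of the stellar mass accuracy for a single halo is given below. The tuple layouts and the function name are hypothetical; following the definition above, only ORCA members registered to the halo contribute, and fragments of the same halo are combined.

```python
def stellar_mass_accuracy(orca_clusters, lcdm_cluster, halo_id):
    """Recovered stellar mass over the LCDM cluster's stellar mass for one halo.

    orca_clusters: list of clusters, each a list of (galaxy_id, halo_id, mass).
    lcdm_cluster:  list of (galaxy_id, mass) for the halo's LCDM cluster.
    Members from other haloes are ignored; each galaxy is counted once.
    """
    recovered = {}
    for cluster in orca_clusters:
        for galaxy_id, gal_halo, mass in cluster:
            if gal_halo == halo_id:
                recovered[galaxy_id] = mass
    total = sum(mass for _, mass in lcdm_cluster)
    return sum(recovered.values()) / total
```

For a halo whose two most massive galaxies hold 80% of the stellar mass, recovering just those two galaxies already gives an accuracy of 0.8.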






Figure 15. The purity of ΛCDM clusters detected by the ORCA algorithm. 
Low values indicate where clusters have included a large number of 
contaminating galaxies not belonging to the halo. 

6.4.3 Purity 

As discussed in §6.4.1, a halo is detected by the algorithm if it 
finds at least Nmin members that have been allocated to ORCA clusters. 
For a cluster with 7 members, the distinction between a cluster 
containing 5 halo galaxies and 2 interlopers and one containing 7 
halo galaxies provides a measure of cluster purity. We define purity 
as the fraction of galaxies ORCA assigned to the cluster that 
also belong to the host halo. This description 
is in line with the purity described by Koester et al. (2007b). 
However, we decide not to adopt a threshold above which a cluster 
is considered pure, instead directly assigning each cluster a purity 
fraction. Figure 15 shows the purity of ORCA clusters with varying 
redshift and halo mass, the gridding method here being the same 
scheme introduced in §6.4.1. ORCA clusters are at least 70% pure 
at the median redshift of the survey over all halo masses. The purity 
appears to drop at higher redshifts, attributed to faint but genuine 
cluster members being replaced by brighter contaminants that lie 
on the cluster sequence. Relative to the completeness and stellar 
mass estimates, cluster purity is not as sensitive to halo mass. This 
is most likely a consequence of the membership incompleteness 
discussed in §3.7.2. Because peripheral members are less likely to 
be in Voronoi cells tagged as statistically significant, the inclusion 
of interlopers at cluster edges is reduced. As in the previous sec- 
tion, increased halo fragmentation drives the local drop in purity, 
serving to increase the contamination fraction by distributing the 
halo galaxies among local clusters and systems failing to achieve 
cluster status. 
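
The purity fraction, and the spurious-detection criterion used in §6.4.1, can be sketched as below. Identifying the host halo as the one contributing the most members is our assumption; the input is simply the halo identifier of each galaxy ORCA assigned to the cluster.

```python
from collections import Counter

def purity(member_halo_ids):
    """Fraction of a cluster's members that belong to its host halo.

    The host is taken to be the halo contributing most members (an assumption).
    """
    counts = Counter(member_halo_ids)
    _, n_host = counts.most_common(1)[0]
    return n_host / len(member_halo_ids)

def is_spurious(member_halo_ids):
    """True when every member of the cluster sits in a different halo."""
    return len(set(member_halo_ids)) == len(member_halo_ids)

# The 7-member example from the text: 5 halo galaxies and 2 interlopers.
example = purity(["A"] * 5 + ["B", "C"])
# Two spurious systems among 305 ORCA clusters.
spurious_rate = 100 * 2 / 305
print(f"purity = {example:.2f}, spurious rate = {spurious_rate:.1f}%")
```

The 7-member example therefore has a purity of 5/7, compared with unity for a cluster made up entirely of halo galaxies.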



7 SUMMARY 

We present and demonstrate a new cluster detection algorithm 
based on red-sequence cluster searches, the detection of overdensi- 
ties using Voronoi Tessellations, and connecting galaxies into clus- 
ters with a Friends-of-Friends algorithm. With this approach, we 
make only two assumptions about the systems we are looking for: 
that they have detectable red-sequences, and are overdensities in 
the projected plane of the sky. 

We calibrate the photometric selection filters to a rich Abell 
cluster found in SDSS data, and find that recovery of members from 
both this large cluster and a small group is largely insensitive to the 
choice of two algorithm parameters controlling the behaviour of 
the algorithm. When applying the algorithm to a sample of SDSS 
Stripe 82 galaxies with four bands, we find 97 clusters. Based on 
spectroscopic and photometric redshifts, we estimate these clusters 
are detected out to z = 0.6 and the catalogue has a median redshift 
of z = 0.31. We perform false-positive tests suggesting the 
spurious detection frequency is below 1%. Tests on the catalogue 
suggest the detector is robust to sparsely sampled cluster fields and 
is not overly sensitive to survey edges. In comparing our data to 
existing optical and X-ray clusters, we find good agreement with 
the maxBCG and RASS catalogues in the same region. 

We go on to test the performance of the detector with a mock 
survey generated from a semi-analytic galaxy formation model. In 
comparing the ORCA cluster detections to those generated from 
halo membership data, we make a quantitative assessment of the 
detector performance. The algorithm identifies 305 clusters, whilst 
the simulation produces 414 down to a halo mass of 10^[…] h^-1 M⊙. 
At the median redshift of the catalogues (both z = 0.33) we 
find ORCA is 75% complete down to a cluster halo mass of 
10^[…] h^-1 M⊙ and is able to recover approximately 75% of the 
total stellar mass for clusters in haloes of at least 10^[…] h^-1 M⊙. 

We have demonstrated this algorithm is capable of identifying 
clusters in both real and simulated data with minimal assumptions 
as to the nature of clusters. In combining comprehensive colour 
scans to search for cluster red-sequences with Voronoi diagrams 
to estimate surface densities, we avoid making model-dependent 
decisions about what a cluster is. Cluster redshifts arise as a conse- 
quence, not condition, of our detection, affording additional free- 
dom from model SEDs and the uncertainties inherent in photo- 
metric redshift data spanning the depths, fluxes and areas set to 
be commonplace in next-generation galaxy catalogues. This detec- 
tor can be used in any survey where there are at least two photo- 
metric bands, but is most powerful when applied to multi-colour 
surveys such as the forthcoming Pan-STARRS surveys. The scope 
for cluster detection with ORCA is not limited solely to the optical 
regime. Preliminary tests with optical-IR band-merged catalogues 
show great promise, requiring minimal adaptation to facilitate the 
detection of the 4000 Å break into the IR bands and beyond z = 1. 

APPENDIX: CLUSTER IMAGES 




Figure 16. Stripe 82 cluster MGB J234341+00180.3 is an extended system detected between two maxBCG clusters (BCG J234322+00190.6 and BCG 
J234403+00130.6). For clarity, we have not plotted the Voronoi grid, but the cluster members are marked with blue cross-hairs. The maxBCG clusters are shown 
in red, with the central positions noted by the two smaller circles, and the larger circles corresponding to radii of 1 h^-1 Mpc based on the photometrically- 
estimated cluster redshift from Koester et al. (2007a). 

Figure 17. Stripe 82 cluster MGB J234105+00180.3: an ORCA detection between two maxBCG clusters and on top of an X-ray cluster position. Members and 
their Voronoi cells are marked in blue, the thick circle indicating the estimated cluster centre. Grey dashed circles are associate cluster members arising from 
multiple detections of this cluster (§3.6). Red data indicate the location of maxBCG clusters BCG J234122+00190.0 and BCG J234106+00120.4, with larger 
circles indicating a 1 h^-1 Mpc radius, smaller circles the BCG positions. Yellow data indicate the NORAS X-ray cluster RXC J2341.1+0018; the half-length 
of the large square corresponds to 1 h^-1 Mpc based on the cluster redshift, the small square noting the X-ray position, uncertain to approximately 1'. The 
X-ray-ORCA centroid separation is approximately 0.4'. 



